(57) Abstract

A cellulose- or hemicellulose-degrading enzyme which is derivable from a fungus other than Trichoderma or Phanerochaete,
and which comprises a carbohydrate binding domain homologous to a terminal A region of Trichoderma reesei cellulases,
which carbohydrate binding domain comprises amino acid sequence (a) or a subsequence thereof capable of effecting binding of
the enzyme to an insoluble cellulosic or hemicellulosic substrate.

AN ENZYME CAPABLE OF DEGRADING CELLULOSE OR HEMICELLULOSE

FIELD OF INVENTION 

The present invention relates to a cellulose- or
hemicellulose-degrading enzyme, a DNA construct coding for the
enzyme, a method of producing the enzyme, and an agent for
degrading cellulose or hemicellulose comprising the enzyme.

BACKGROUND OF THE INVENTION

Enzymes which are able to degrade cellulose have previously
been suggested for the conversion of biomass into liquid fuel,
gas and feed protein. However, the production of fermentable
sugars from biomass by means of cellulolytic enzymes is not yet
able to compete economically with, for instance, the production
of glucose from starch by means of α-amylase due to the
inefficiency of the currently used cellulolytic enzymes.
Cellulolytic enzymes may furthermore be used in the brewing
industry for the degradation of β-glucans, in the baking
industry for improving the properties of flour, in paper pulp
processing for removing the non-crystalline parts of cellulose,
thus increasing the proportion of crystalline cellulose in the
pulp, and in animal feed for improving the digestibility of
glucans. A further important use of cellulolytic enzymes is for
textile treatment, e.g. for reducing the harshness of
cotton-containing fabrics (cf., for instance, GB 1 368 599 or US
4,435,307), for soil removal and colour clarification of
fabrics (cf., for instance, EP 220 016) or for providing a
localized variation in colour to give the fabrics a
"stone-washed" appearance (cf., for instance, EP 307 564).

The practical exploitation of cellulolytic enzymes has, to some
extent, been set back by the nature of the known cellulase
preparations which are often complex mixtures of a variety of
single cellulase components, and which may have a rather low
specific activity. It is difficult to optimise the production
of single components in multiple enzyme systems and thus to
implement industrial cost-effective production of cellulolytic
enzymes, and their actual use has been hampered by difficulties
arising from the need to employ rather large quantities of the
enzymes to achieve the desired effect.

The drawbacks of previously suggested cellulolytic enzymes may
be remedied by using single-component enzymes selected for a
high specific activity.

Single-component cellulolytic enzymes have been isolated from,
e.g. Trichoderma reesei (cf. Teeri et al., Gene 51, 1987, pp.
43-52; P.M. Abuja, Biochem. Biophys. Res. Comm. 156, 1988, pp.
180-185; and P.J. Kraulis, Biochemistry 28, 1989, pp.
7241-7257). The T. reesei cellulases have been found to be composed
of a terminal A region responsible for binding to cellulose, a
B region linking the A region to the core of the enzyme, and a
core containing the catalytically active domain. The A region
of different T. reesei cellulases has been found to be highly
conserved, and a strong homology has also been observed with a
cellulase produced by Phanerochaete chrysosporium (Sims et al.,
Gene 74, 1988, pp. 411-422).

SUMMARY OF THE INVENTION 

It has surprisingly been found that other fungi, which are not
closely related to either Trichoderma reesei or Phanerochaete
chrysosporium, are capable of producing enzymes which contain
a region which is homologous to the A region of T. reesei
cellulases.

Accordingly, the present invention relates to a cellulose- or
hemicellulose-degrading enzyme which is derivable from a fungus
other than Trichoderma or Phanerochaete, and which comprises a
carbohydrate binding domain homologous to a terminal A region
of Trichoderma reesei cellulases, which carbohydrate binding
domain comprises the following amino acid sequence

1                                   10
Xaa Xaa Gln Cys Gly Gly Xaa Xaa Xaa Xaa Gly Xaa Xaa Xaa Cys Xaa
            20                                      30
Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Asn Xaa Xaa Tyr Xaa Gln Cys Xaa
Xaa



or a subsequence thereof capable of effecting binding of the
enzyme to an insoluble cellulosic or hemicellulosic substrate.
"Xaa" is intended to indicate variations in the amino acid
sequence of the carbohydrate binding domain of different
enzymes. A hyphen is intended to indicate a "gap" in the amino
acid sequence (compared to other, similar enzymes).

In the present context, the term "cellulose" is intended to
include soluble and insoluble, amorphous and crystalline forms
of cellulose. The term "hemicellulose" is intended to include
glucans (apart from starch), mannans, xylans, arabinans or
polyglucuronic or polygalacturonic acid. The term "carbohydrate
binding domain" ("CBD") is intended to indicate an amino acid
sequence capable of effecting binding of the enzyme to a
carbohydrate substrate, in particular cellulose or
hemicellulose as defined above. The term "homologous" is
intended to indicate a high degree of identity in the sequence
of amino acids constituting the carbohydrate binding domain of
the present enzyme and the amino acids constituting the A
region found in T. reesei cellulases ("A region" is the term
used to denote the cellulose (i.e. carbohydrate) binding domain
of T. reesei cellulases).

It is currently believed that cellulose- or
hemicellulose-degrading enzymes which contain a sequence of
amino acids which is identifiable as a carbohydrate binding
domain (or "A region") based on its homology to the A region of
T. reesei cellulases possess certain desirable characteristics
as a result of the function of the carbohydrate binding domain
in the enzyme molecule, which is to mediate binding to solid
substrates (including cellulose) and consequently to enhance
the activity of such enzymes towards such substrates. The
identification and preparation of carbohydrate binding
domain-containing enzymes from a variety of microorganisms is
therefore of considerable interest.

Cellulose- or hemicellulose-degrading enzymes of the invention
may conveniently be identified by screening genomic or cDNA
libraries of different fungi with a probe comprising at least
part of the DNA encoding the A region of T. reesei cellulases.
Due to the intraspecies (i.e. different T. reesei cellulases)
and interspecies homology observed for the carbohydrate binding
domains of different cellulose- or hemicellulose-degrading
enzymes, there is reason to believe that this screening method
constitutes a convenient way of isolating enzymes of current
interest.

DETAILED DISCLOSURE OF THE INVENTION 

Carbohydrate binding domain (CBD) containing enzymes of the
invention may, in particular, be derivable from strains of
Humicola, e.g. Humicola insolens, Fusarium, e.g. Fusarium
oxysporum, or Myceliophthora, e.g. Myceliophthora thermophila.

Some of the variations in the amino acid sequence shown above
appear to be "conservative", i.e. certain amino acids are
preferred in these positions among the various CBD-containing
enzymes of the invention. Thus, in position 1 of the sequence
shown above, the amino acid is preferentially Trp or Tyr. In
position 2, the amino acid is preferentially Gly or Ala. In
position 7, the amino acid is preferentially Gln, Ile or Asn.
In position 8, the amino acid is preferentially Gly or Asn. In
position 9, the amino acid is preferentially Trp, Phe or Tyr.
In position 10, the amino acid is preferentially Ser, Asn, Thr
or Gln. In position 12, the amino acid is preferentially Pro,
Ala or Cys. In position 13, the amino acid is preferentially
Thr, Arg or Lys. In position 14, the amino acid is
preferentially Thr, Cys or Asn. In position 18, the amino acid
is preferentially Gly or Pro. In position 19, the amino acid
(if present) is preferentially Ser, Thr, Phe, Leu or Ala. In
position 20, the amino acid is preferentially Thr or Lys. In
position 24, the amino acid is preferentially Gln or Ile. In
position 26, the amino acid is preferentially Gln, Asp or Ala.
In position 27, the amino acid is preferentially Trp, Phe or
Tyr. In position 29, the amino acid is preferentially Ser, His
or Tyr. In position 32, the amino acid is preferentially Leu,
Ile, Gln, Val or Thr.
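
For illustration only, the fixed positions of the consensus sequence can be expressed as a pattern and checked mechanically. The following Python sketch is not part of the disclosure. It assumes that position 19 may be absent (the "gap" case) and that position 33 is optional, as the specific sequences given in this specification suggest. The names THREE_TO_ONE, CBD_PATTERN, to_one_letter and matches_cbd_consensus are hypothetical.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Fixed residues of the consensus: Gln3 Cys4 Gly5 Gly6 Gly11 Cys15
# Cys21 Asn25 Tyr28 Gln30 Cys31; every Xaa is written as ".".
# Assumption: positions 16-20 span 4 or 5 residues (position 19 may
# be a gap) and the tail after Cys31 is 1 or 2 residues long.
CBD_PATTERN = re.compile(r"..QCGG.{4}G.{3}C.{4,5}C.{3}N..Y.QC.{1,2}")


def to_one_letter(seq3):
    """Convert a space-separated three-letter sequence, skipping gaps."""
    return "".join(THREE_TO_ONE[tok] for tok in seq3.split() if tok != "-")


def matches_cbd_consensus(seq3):
    """Return True if the whole sequence fits the consensus pattern."""
    return CBD_PATTERN.fullmatch(to_one_letter(seq3)) is not None


# Two CBD sequences taken from this specification (the second has a gap).
CBD_NO_GAP = ("Trp Gly Gln Cys Gly Gly Gln Gly Trp Asn Gly Pro Thr Cys Cys Glu "
              "Ala Gly Thr Thr Cys Arg Gln Gln Asn Gln Trp Tyr Ser Gln Cys Leu")
CBD_WITH_GAP = ("Trp Gly Gln Cys Gly Gly Asn Gly Tyr Ser Gly Pro Thr Thr Cys Ala "
                "Glu Gly - Thr Cys Lys Lys Gln Asn Asp Trp Tyr Ser Gln Cys Thr Pro")

print(matches_cbd_consensus(CBD_NO_GAP))    # True
print(matches_cbd_consensus(CBD_WITH_GAP))  # True
```

The pattern checks only the invariant residues; the preferred residues listed for the variable positions are deliberately not enforced, since the text describes them as preferences rather than requirements.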

Examples of specific CBD-containing enzymes of the invention
are those which comprise one of the following amino acid
sequences

Trp Gly Gln Cys Gly Gly Gln Gly Trp Asn Gly Pro Thr Cys Cys Glu
Ala Gly Thr Thr Cys Arg Gln Gln Asn Gln Trp Tyr Ser Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Ile Gly Trp Asn Gly Pro Thr Thr Cys Val
Ser Gly Ala Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Ile Gly Phe Asn Gly Pro Thr Cys Cys Gln
Ser Gly Ser Thr Cys Val Lys Gln Asn Asp Trp Tyr Ser Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Asn Gly Tyr Ser Gly Pro Thr Thr Cys Ala
Glu Gly - Thr Cys Lys Lys Gln Asn Asp Trp Tyr Ser Gln Cys Thr
Pro;

Trp Gly Gln Cys Gly Gly Gln Gly Trp Gln Gly Pro Thr Cys Cys Ser
Gln Gly - Thr Cys Arg Ala Gln Asn Gln Trp Tyr Ser Gln Cys Leu
Asn;

Trp Gly Gln Cys Gly Gly Gln Gly Tyr Ser Gly Cys Thr Asn Cys Glu
Ala Gly Ser Thr Cys Arg Gln Gln Asn Ala Tyr Tyr Ser Gln Cys
Ile;

Trp Gly Gln Cys Gly Gly Gln Gly Tyr Ser Gly Cys Arg Asn Cys Glu
Ser Gly Ser Thr Cys Arg Ala Gln Asn Asp Trp Tyr Ser Gln Cys
Leu;

Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys Val
Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Gln Asn Tyr Ser Gly Pro Thr Thr Cys Lys
Ser Pro Phe Thr Cys Lys Lys Ile Asn Asp Phe Tyr Ser Gln Cys
Gln; or

Trp Gly Gln Cys Gly Gly Asn Gly Trp Thr Gly Ala Thr Thr Cys Ala
Ser Gly Leu Lys Cys Glu Lys Ile Asn Asp Trp Tyr Tyr Gln Cys Val

The cellulose- or hemicellulose-degrading enzyme of the
invention may further comprise an amino acid sequence which
defines a linking B region (to use the nomenclature established
for T. reesei cellulases) adjoining the carbohydrate binding
domain and connecting it to the catalytically active domain of
the enzyme. The B region sequences established so far for
enzymes of the invention indicate that such sequences are
characterized by being predominantly hydrophilic and uncharged,
and by being enriched in certain amino acids, in particular
glycine and/or asparagine and/or proline and/or serine and/or
threonine and/or glutamine. This characteristic structure of
the B region imparts flexibility to the sequence, in particular
in sequences containing short, repetitive units of primarily
glycine and asparagine. Such repeats are not found in the B
region sequences of T. reesei or P. chrysosporium which contain
B regions of the serine/threonine type. The flexible structure
is believed to facilitate the action of the catalytically
active domain of the enzyme bound by the A region to the
insoluble substrate, and therefore imparts advantageous
properties to the enzyme of the invention.

Specific examples of B regions contained in enzymes of the
invention have the following amino acid sequences

Ala Arg Thr Asn Val Gly Gly Gly Ser Thr Gly Gly Gly Asn Asn Gly
Gly Gly Asn Asn Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly Asn Pro
Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly Asn Cys
Ser Pro Leu;

Pro Gly Gly Asn Asn Asn Asn Pro Pro Pro Ala Thr Thr Ser Gln Trp
Thr Pro Pro Pro Ala Gln Thr Ser Ser Asn Pro Pro Pro Thr Gly Gly
Gly Gly Gly Asn Thr Leu His Glu Lys;

Gly Gly Ser Asn Asn Gly Gly Gly Asn Asn Asn Gly Gly Gly Asn Asn
Asn Gly Gly Gly Gly Asn Asn Asn Gly Gly Gly Asn Asn Asn Gly Gly
Gly Asn Thr Gly Gly Gly Ser Ala Pro Leu;

Val Phe Thr Cys Ser Gly Asn Ser Gly Gly Gly Ser Asn Pro Ser Asn
Pro Asn Pro Pro Thr Pro Thr Thr Phe Ile Thr Gln Val Pro Asn Pro
Thr Pro Val Ser Pro Pro Thr Cys Thr Val Ala Lys;

Pro Ala Leu Trp Pro Asn Asn Asn Pro Gln Gln Gly Asn Pro Asn Gln
Gly Gly Asn Asn Gly Gly Gly Asn Gln Gly Gly Gly Asn Gly Gly Cys
Thr Val Pro Lys;

Pro Gly Ser Gln Val Thr Thr Ser Thr Thr Ser Ser Ser Ser Thr Thr
Ser Arg Ala Thr Ser Thr Thr Ser Ala Gly Gly Val Thr Ser Ile Thr
Thr Ser Pro Thr Arg Thr Val Thr Ile Pro Gly Gly Ala Ser Thr Thr
Ala Ser Tyr Asn;

Glu Ser Gly Gly Gly Asn Thr Asn Pro Thr Asn Pro Thr Asn Pro Thr
Asn Pro Thr Asn Pro Thr Asn Pro Trp Asn Pro Gly Asn Pro Thr Asn
Pro Gly Asn Pro Gly Gly Gly Asn Gly Gly Asn Gly Gly Asn Cys Ser
Pro Leu; or

Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser Pro Val Asn Gln
Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr Ser Ser Pro Pro
Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu Arg
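
The statement that B regions are enriched in glycine, asparagine, proline, serine, threonine and glutamine can be checked by a simple residue count. The following Python sketch is illustrative only: the function name enriched_fraction and the constant names are hypothetical, and the example sequence is the fifth B region listed above.

```python
# Residues the specification names as enriched in the B region.
ENRICHED = {"Gly", "Asn", "Pro", "Ser", "Thr", "Gln"}


def enriched_fraction(seq3):
    """Fraction of residues in a three-letter sequence that belong to
    the enriched set (Gly, Asn, Pro, Ser, Thr, Gln)."""
    residues = [tok for tok in seq3.split() if tok != "-"]
    if not residues:
        raise ValueError("empty sequence")
    return sum(res in ENRICHED for res in residues) / len(residues)


# Fifth B region sequence from the list above.
B_REGION_EXAMPLE = (
    "Pro Ala Leu Trp Pro Asn Asn Asn Pro Gln Gln Gly Asn Pro Asn Gln "
    "Gly Gly Asn Asn Gly Gly Gly Asn Gln Gly Gly Gly Asn Gly Gly Cys "
    "Thr Val Pro Lys"
)

print(f"{enriched_fraction(B_REGION_EXAMPLE):.2f}")  # 0.83
```

A high fraction is only a rough indicator; it does not capture the short repetitive units of glycine and asparagine that the text associates with flexibility.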

In another aspect, the present invention relates to a
carbohydrate binding domain homologous to a terminal A region
of Trichoderma reesei cellulases, which carbohydrate binding
domain comprises the following amino acid sequence

1                                   10
Xaa Xaa Gln Cys Gly Gly Xaa Xaa Xaa Xaa Gly Xaa Xaa Xaa Cys Xaa
            20                                      30
Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Asn Xaa Xaa Tyr Xaa Gln Cys Xaa
Xaa

or a subsequence thereof capable of effecting binding of a
protein to an insoluble cellulosic or hemicellulosic substrate.

Examples of specific carbohydrate binding domains are those
with the amino acid sequence indicated above.

In a further aspect, the present invention relates to a linking
B region derived from a cellulose- or hemicellulose-degrading
enzyme, said region comprising an amino acid sequence enriched
in the amino acids glycine and/or asparagine and/or proline
and/or serine and/or threonine and/or glutamine. As indicated
above, these amino acids may often occur in short, repetitive
units. Examples of specific B region sequences are those shown
above.

The present invention provides a unique opportunity to
"shuffle" the various regions of different cellulose- or
hemicellulose-degrading enzymes, thereby creating novel
combinations of the CBD, B region and catalytically active
domain resulting in novel activity profiles of this type of
enzyme. Thus, the enzyme of the invention may be one which
comprises an amino acid sequence defining a CBD, which amino
acid sequence is derived from one naturally occurring
cellulose- or hemicellulose-degrading enzyme, an amino acid
sequence defining a linking B region, which amino acid sequence
is derived from another naturally occurring cellulose- or
hemicellulose-degrading enzyme, as well as a catalytically
active domain derived from the enzyme supplying either the CBD
or the B region or from a third enzyme. In a particular
embodiment, the catalytically active domain is derived from an
enzyme which does not, in nature, comprise any CBD or B region.
In this way, it is possible to construct enzymes with improved
binding properties from enzymes which lack the CBD and B
regions.

The enzyme of the invention is preferably a cellulase such as
an endoglucanase (capable of hydrolysing amorphous regions of
low crystallinity in cellulose fibres), a cellobiohydrolase
(also known as an exoglucanase, capable of initiating
degradation of cellulose from the non-reducing chain ends by
removing cellobiose units) or a β-glucosidase.

In a still further aspect, the present invention relates to a
DNA construct which comprises a DNA sequence encoding a
cellulose- or hemicellulose-degrading enzyme as described
above.

A DNA sequence encoding the present enzyme may, for instance,
be isolated by establishing a cDNA or genomic library of a
microorganism known to produce cellulose- or
hemicellulose-degrading enzymes, such as a strain of Humicola,
Fusarium or Myceliophthora, and screening for positive clones by
conventional procedures such as by hybridization to
oligonucleotide probes synthesized on the basis of the full or
partial amino acid sequence of the enzyme or probes based on
the partial or full DNA sequence of the A region from T. reesei
cellulases, as indicated above, or by selecting for clones
expressing the appropriate enzyme activity, or by selecting for
clones producing a protein which is reactive with an antibody
raised against a native cellulose- or hemicellulose-degrading
enzyme.

Alternatively, the DNA sequence encoding the enzyme may be
prepared synthetically by established standard methods, e.g.
the phosphoramidite method described by S.L. Beaucage and M.H.
Caruthers, Tetrahedron Letters 22, 1981, pp. 1859-1869, or the
method described by Matthes et al., The EMBO J. 3, 1984, pp.
801-805. According to the phosphoramidite method,
oligonucleotides are synthesized, e.g. in an automatic DNA
synthesizer, purified, annealed, ligated and cloned in
appropriate vectors.

Finally, the DNA sequence may be of mixed genomic and
synthetic, mixed synthetic and cDNA or mixed genomic and cDNA
origin prepared by ligating fragments of synthetic, genomic or
cDNA origin (as appropriate), the fragments corresponding to
various parts of the entire DNA construct, in accordance with
standard techniques. Thus, it may be envisaged that a DNA
sequence encoding the CBD of the enzyme may be of genomic
origin, while the DNA sequence encoding the B region of the
enzyme may be of synthetic origin, or vice versa; the DNA
sequence encoding the catalytically active domain of the enzyme
may conveniently be of genomic or cDNA origin. The DNA
construct may also be prepared by polymerase chain reaction
using specific primers, for instance as described in US
4,683,202 or R.K. Saiki et al., Science 239, 1988, pp. 487-491.

The present invention also relates to an expression vector
which carries an inserted DNA construct as described above. The
expression vector may suitably comprise appropriate promoter,
operator and terminator sequences permitting the enzyme to be
expressed in a particular host organism, as well as an origin
of replication enabling the vector to replicate in the host
organism in question.

The resulting expression vector may then be transformed into a
suitable host cell, such as a fungal cell, a preferred example
of which is a species of Aspergillus, most preferably
Aspergillus oryzae or Aspergillus niger. Fungal cells may be
transformed by a process involving protoplast formation and
transformation of the protoplasts followed by regeneration of
the cell wall in a manner known per se. The use of Aspergillus
as a host microorganism is described in EP 238 023 (of Novo
Industri A/S), the contents of which are hereby incorporated by
reference.

Alternatively, the host organism may be a bacterium, in
particular strains of Streptomyces and Bacillus, and E. coli.
The transformation of bacterial cells may be performed
according to conventional methods, e.g. as described in
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor, 1989.

The screening of appropriate DNA sequences and construction of
vectors may also be carried out by standard procedures, cf.
Sambrook et al., op. cit.

The invention further relates to a method of producing a
cellulose- or hemicellulose-degrading enzyme as described
above, wherein a cell transformed with the expression vector of
the invention is cultured under conditions conducive to the
production of the enzyme, and the enzyme is subsequently
recovered from the culture. The medium used to culture the
transformed host cells may be any conventional medium suitable
for growing the host cells in question. The expressed enzyme
may conveniently be secreted into the culture medium and may be
recovered therefrom by well-known procedures including
separating the cells from the medium by centrifugation or
filtration, precipitating proteinaceous components of the
medium by means of a salt such as ammonium sulphate, followed
by chromatographic procedures such as ion exchange
chromatography, affinity chromatography, or the like.

By employing recombinant DNA techniques as indicated above,
techniques of fermentation and mutation or other techniques
which are well known in the art, it is possible to provide
cellulose- or hemicellulose-degrading enzymes of a high purity
and in a high yield.

The present invention further relates to an agent for degrading
cellulose or hemicellulose, the agent comprising a cellulose-
or hemicellulose-degrading enzyme as described above. It is
contemplated that, dependent on the specificity of the enzyme,
it may be employed for one (or possibly more) of the
applications mentioned above. In a particular embodiment, the
agent may comprise a combination of two or more enzymes of the
invention or a combination of one or more enzymes of the
invention with one or more other enzymes with cellulose- or
hemicellulose-degrading activity.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 shows the construction of plasmid pSX224;
Fig. 2 shows the construction of plasmid pHW485;
Fig. 3 shows the construction of plasmid pHW697 and pHW704;
Fig. 4 shows the construction of plasmid pHW768;
Fig. 5 is a restriction map of plasmid pSX320;
Fig. 6 shows the construction of plasmid pSX777;
Fig. 7 shows the construction of plasmid pCaHj170;
Fig. 8 shows the construction of plasmid IM4;
Fig. 9 shows the SOE fusion of the ~43kD endoglucanase signal
peptide and the N-terminal of Endo1;
Fig. 10 shows the construction of plasmid pCaHj180;
Fig. 11 shows the DNA sequence and derived amino acid sequence
of F. oxysporum C-family cellobiohydrolase;
Fig. 12 shows the DNA sequence and derived amino acid sequence
of F. oxysporum F-family cellulase;
Fig. 13 shows the DNA sequence and derived amino acid sequence
of F. oxysporum C-family endoglucanase;
Fig. 14A-E shows the DNA sequence and derived amino acid
sequence of H. insolens endoglucanase 1 (EG1); and
Fig. 15A-D shows the DNA sequence and derived amino acid
sequence of a fusion of the B. lautus (NCIMB 40250) Endo 1
catalytic domain and the CBD and B region of H. insolens ~43kD
endoglucanase.

The invention is further illustrated in the following examples
which are not in any way intended to limit the scope of the
invention as claimed.

Example 1

Isolation of A region-containing clones from H. insolens



From H. insolens strain DSM 1800 (described in, e.g. WO
89/09259) grown on cellulose, mRNA was prepared according to
the method described by Kaplan et al., Biochem. J. 183 (1979)
181-184. A cDNA library containing 20,000 clones was obtained
substantially by the method of Okayama and Berg, Methods in
Enzymology 154, 1987, pp. 3-29.

The cDNA library was screened as described by Gergen et al.,
Nucl. Acids Res. 7 (8), 1979, pp. 2115-2136, with
oligonucleotide probes in the antisense configuration, designed
according to the published sequences of the N-terminal part of
the A-region of the four T. reesei cellulase genes (Penttila et
al., Gene 45 (1986), 253-63; Saloheimo et al., Gene 63 (1988),
11-21; Shoemaker et al., Biotechnology, October 1983, 691-696;
Teeri et al., Gene 51 (1987), 43-52). The probe sequences were as
follows:

NOR-804  5'-CTT GCA CCC GCT GTA CCC AAT GCC ACC GCA CTG CCC
(~EG 1)  CCA-3'

NOR-805  5'-CGT GGG GCC GCT GTA GCC AAT ACC GCC GCA CTG GCC
(~CBH 1) GTA-3'

NOR-807  5'-AGT CGG ACC CGA CCA ATT CTG GCC ACC ACA TTG GCC
(~CBH 2) CCA-3'

NOR-808  5'-CGT AGG TCC GCT CCA ACC AAT ACC TCC ACA CTG GCC
(~EG 3)  CCA-3'
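
Because the probes are in the antisense configuration, taking the reverse complement of each probe and translating it with the standard genetic code should recover the N-terminal A-region residues against which it was designed. The following Python sketch illustrates this check; it is not part of the disclosed procedure, and the function and variable names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (spaces ignored)."""
    return dna.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate a DNA string in frame 1 into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Probe sequences as given above (5' to 3', antisense).
PROBES = {
    "NOR-804": "CTT GCA CCC GCT GTA CCC AAT GCC ACC GCA CTG CCC CCA",
    "NOR-805": "CGT GGG GCC GCT GTA GCC AAT ACC GCC GCA CTG GCC GTA",
    "NOR-807": "AGT CGG ACC CGA CCA ATT CTG GCC ACC ACA TTG GCC CCA",
    "NOR-808": "CGT AGG TCC GCT CCA ACC AAT ACC TCC ACA CTG GCC CCA",
}

for name, probe in PROBES.items():
    print(name, translate(reverse_complement(probe)))
# NOR-804 WGQCGGIGYSGCK
# NOR-805 YGQCGGIGYSGPT
# NOR-807 WGQCGGQNWSGPT
# NOR-808 WGQCGGIGWSGPT
```

Each translated probe contains the Gln-Cys-Gly-Gly motif at positions 3 to 6, in agreement with the consensus sequence of the carbohydrate binding domain.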

Screening yielded a large number of candidates hybridising well
to the A-region probes. Restriction mapping reduced the number
of interesting clones to 17, of which 8 have so far been
sequenced (as described by Haltiner et al., Nucl. Acids Res.
13, 1985, pp. 1015-1025) sufficiently to confirm the presence
of a terminal CBD as well as a B-region.

The deduced amino acid sequences obtained for the CBDs were as
follows

A-1: Trp Gly Gln Cys Gly Gly Gln Gly Trp Asn Gly Pro Thr Cys
Cys Glu Ala Gly Thr Thr Cys Arg Gln Gln Asn Gln Trp Tyr Ser Gln
Cys Leu;

A-5: Trp Gly Gln Cys Gly Gly Ile Gly Trp Asn Gly Pro Thr Thr
Cys Val Ser Gly Ala Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln
Cys Leu;

CBH-2: Trp Gly Gln Cys Gly Gly Ile Gly Phe Asn Gly Pro Thr Cys
Cys Gln Ser Gly Ser Thr Cys Val Lys Gln Asn Asp Trp Tyr Ser Gln
Cys Leu;

A-8: Trp Gly Gln Cys Gly Gly Asn Gly Tyr Ser Gly Pro Thr Thr
Cys Ala Glu Gly - Thr Cys Lys Lys Gln Asn Asp Trp Tyr Ser Gln
Cys Thr Pro;

A-9: Trp Gly Gln Cys Gly Gly Gln Gly Trp Gln Gly Pro Thr Cys
Cys Ser Gln Gly - Thr Cys Arg Ala Gln Asn Gln Trp Tyr Ser Gln
Cys Leu Asn;

A-11: Trp Gly Gln Cys Gly Gly Gln Gly Tyr Ser Gly Cys Thr Asn
Cys Glu Ala Gly Ser Thr Cys Arg Gln Gln Asn Ala Tyr Tyr Ser Gln
Cys Ile;

A-19: Trp Gly Gln Cys Gly Gly Gln Gly Tyr Ser Gly Cys Arg Asn
Cys Glu Ser Gly Ser Thr Cys Arg Ala Gln Asn Asp Trp Tyr Ser Gln
Cys Leu; and

~43 kD: Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr
Cys Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln
Cys Leu

The deduced amino acid sequences obtained for the B region were
as follows

A1: Ala Arg Thr Asn Val Gly Gly Gly Ser Thr Gly Gly Gly Asn
Asn Gly Gly Gly Asn Asn Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly
Asn Pro Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly
Asn Cys Ser Pro Leu;

A5: Pro Gly Gly Asn Asn Asn Asn Pro Pro Pro Ala Thr Thr Ser
Gln Trp Thr Pro Pro Pro Ala Gln Thr Ser Ser Asn Pro Pro Pro Thr
Gly Gly Gly Gly Gly Asn Thr Leu His Glu Lys;

A8: Gly Gly Ser Asn Asn Gly Gly Gly Asn Asn Asn Gly Gly Gly
Asn Asn Asn Gly Gly Gly Gly Asn Asn Asn Gly Gly Gly Asn Asn Asn
Gly Gly Gly Asn Thr Gly Gly Gly Ser Ala Pro Leu;

A11: Val Phe Thr Cys Ser Gly Asn Ser Gly Gly Gly Ser Asn Pro
Ser Asn Pro Asn Pro Pro Thr Pro Thr Thr Phe Ile Thr Gln Val Pro
Asn Pro Thr Pro Val Ser Pro Pro Thr Cys Thr Val Ala Lys;

A19: Pro Ala Leu Trp Pro Asn Asn Asn Pro Gln Gln Gly Asn Pro
Asn Gln Gly Gly Asn Asn Gly Gly Gly Asn Gln Gly Gly Gly Asn Gly
Gly Cys Thr Val Pro Lys;

CBH2: Pro Gly Ser Gln Val Thr Thr Ser Thr Thr Ser Ser Ser Ser
Thr Thr Ser Arg Ala Thr Ser Thr Thr Ser Ala Gly Gly Val Thr Ser
Ile Thr Thr Ser Pro Thr Arg Thr Val Thr Ile Pro Gly Gly Ala Ser
Thr Thr Ala Ser Tyr Asn;

A9: Glu Ser Gly Gly Gly Asn Thr Asn Pro Thr Asn Pro Thr Asn
Pro Thr Asn Pro Thr Asn Pro Thr Asn Pro Trp Asn Pro Gly Asn Pro
Thr Asn Pro Gly Asn Pro Gly Gly Gly Asn Gly Gly Asn Gly Gly Asn
Cys Ser Pro Leu; or

Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser Pro Val Asn Gln
Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr Ser Ser Pro Pro
Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu Arg

Example 2

Expression in A. oryzae of a CBH 2-type cellulase from H.
insolens

The complete sequence of one of the CBD clones shows a striking
similarity to a cellobiohydrolase (CBH 2) from T. reesei.

The construction of the expression vector pSX224 carrying the
H. insolens CBH 2 gene for expression in and secretion from A.
oryzae is outlined in Fig. 1. The vector p777 containing the
pUC 19 replicon and the regulatory regions of the TAKA amylase
promoter from A. oryzae and glucoamylase terminator from A.
niger is described in EP 238 023. pSX217 is composed of the
cloning vector pcDV1-pL1 (cf. Okayama and Berg, op. cit.)
carrying the H. insolens CBH 2 gene on a 1.8 kb fragment. The
CBH 2 gene contains three restriction sites used in the
construction: a BalI site at the initiating methionine codon in
the signal sequence, a BstBI site 620 bp downstream from the
BalI site and an AvaII site 860 bp downstream from the BstBI
site. The AvaII site is located in the non-translated
C-terminal part of the gene upstream of the poly A region, which
is not wanted in the final construction. Nor is the poly G
region upstream of the gene in the cloning vector. This region
is excised and replaced by an oligonucleotide linker which
places the translational start codon close to the BamHI site at
the end of the TAKA promoter.

The expression vector pSX224 was transformed into A. oryzae
IFO 4177 using the amdS gene from A. nidulans as the selective
marker as described in EP 238 023. Transformants were grown in
YPD medium (Sherman et al., Methods in Yeast Genetics, Cold
Spring Harbor Laboratory, 1981) for 3-4 days and analysed for
new protein species in the supernatant by sodium dodecyl
sulphate polyacrylamide gel electrophoresis. The CBH 2 from H.
insolens formed a band with an apparent Mw of 65 kD indicating
a substantial glycosylation of the protein chain, which is
calculated to have a Mw of 51 kD on the basis of the amino acid
composition. The intact enzyme binds well to cellulose, while
enzymatic degradation products of 55 kD and 40 kD do not bind,
indicating removal of the A-region and possibly the B-region.
The enzyme has some activity towards filter paper, giving rise
to release of glucose. As expected, it has very limited
endoglucanase activity as measured on soluble cellulose in the
form of carboxymethyl cellulose.

Example 3

Isolation of Fusarium oxysporum genomic DNA

A freeze-dried culture of Fusarium oxysporum was reconstituted
with phosphate buffer, spotted 5 times on each of 5 FOX medium
plates (6% yeast extract, 1.5% K2HPO4, 0.75% MgSO4·7H2O, 22.5%
glucose, 1.5% agar, pH 5.6) and incubated at 37°C. After 6 days
of incubation the colonies were scraped from the plates into 15
ml of 0.001% Tween-80 which resulted in a thick and cloudy
suspension.

Four 1-liter flasks, each containing 300 ml of liquid FOX
medium, were inoculated with 2 ml of the spore suspension and
were incubated at 30°C and 240 rpm. On the 4th day of
incubation, the cultures were filtered through 4 layers of
sterile gauze and washed with sterile water. The mycelia were
dried on Whatman filter paper, frozen in liquid nitrogen,
ground into a fine powder in a cold mortar and added to 75 ml
of fresh lysis buffer (10 mM Tris-Cl, pH 7.4, 1% SDS, 50 mM EDTA,
100 µl DEPC). The thoroughly mixed suspension was incubated in
a 65°C waterbath for 1 hour and then spun for 10 minutes at
4000 rpm and 5°C in a bench-top centrifuge. The supernatant
was decanted and EtOH precipitated. After 1 hour on ice the
solution was spun at 19,000 rpm for 20 minutes. The supernatant
was decanted and isopropanol precipitated. Following
centrifugation at 10,000 rpm for 10 minutes, the supernatant
was decanted and the pellets allowed to dry.

One milliliter of TER solution (10 mM Tris-HCl, pH 7.4, 1 mM
EDTA, 100 µg RNAse A) was added to each tube, and the tubes
were stored at 4°C for two days. The tubes were pooled and
placed in a 65°C waterbath for 30 minutes to suspend non-
dissolved DNA. The solution was extracted twice with
phenol/CHCl3/isoamyl alcohol, twice with CHCl3/isoamyl alcohol
and then ethanol precipitated. The pellet was allowed to settle
and the EtOH was removed. 70% EtOH was added and the DNA stored
overnight at -20°C. After decanting and drying, 1 ml of TER
was added and the DNA was dissolved by incubating the tubes at
65°C for 1 hour. The preparation yielded 1.5 mg of genomic DNA.

Amplification, cloning and sequencing of DNA amplified with
degenerate primers

To amplify DNA from C-family (according to the nomenclature of
Henrissat et al. Gene 81 (1), 1989, pp. 83-96) cellulases using
PCR (cf. US 4,683,195 and US 4,683,202) each "sense"
oligonucleotide was used in combination with each "antisense"
oligonucleotide. Thus, the following primer pair was used:

Primer 1    Primer 2
ZC3220      ZC3221

ZC3220: GCC AAC TAC GGT ACC GG(A/C/G/T) TA(C/T) TG(C/T)
        GA(C/T) (A/G/T)(C/G)(A/G/C/T) CA(G/A) TG

ZC3221: GCG TTG GCC TCT AGA AT(G/A) TCC AT(C/T) TC(A/G/C/T)
        (C/G/T)(A/T)(G/A) CA(G/A) CA

In the PCR reaction, 1 µg of Fusarium oxysporum genomic DNA was
used as the template. Ten times PCR buffer is 100 mM Tris-HCl pH
8.3, 500 mM KCl, 15 mM MgCl2, 0.1% gelatin (Perkin-Elmer Cetus).
The reactions contained the following ingredients:

dH2O              35.75 µl
10X PCR buffer     5 µl
template DNA       5 µl
primer 1           2 µl (40 pmol)
primer 2           2 µl (40 pmol)
Taq polymerase     0.25 µl (1.25 U)
total             50 µl
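The volumes in the table can be checked against the stated total, and the final buffer and primer concentrations follow by simple dilution. The following is only an illustrative sketch in Python; the variable names are not part of the original disclosure, and the unit of the water volume (µl) is inferred from the 50 µl total.

```python
# Volumes in microliters, as listed in the reaction table.
components = {
    "dH2O": 35.75,
    "10X PCR buffer": 5.0,
    "template DNA": 5.0,
    "primer 1": 2.0,
    "primer 2": 2.0,
    "Taq polymerase": 0.25,
}
total_ul = sum(components.values())

# Composition of the 10X buffer (mM), as stated in the text.
buffer_10x_mM = {"Tris-HCl": 100.0, "KCl": 500.0, "MgCl2": 15.0}
dilution = components["10X PCR buffer"] / total_ul
final_mM = {name: conc * dilution for name, conc in buffer_10x_mM.items()}

# 40 pmol of each primer in the total volume; pmol/ul is numerically micromolar.
primer_uM = 40.0 / total_ul

print(f"total volume: {total_ul} ul")
for name, conc in final_mM.items():
    print(f"{name}: {conc:.2f} mM")
print(f"each primer: {primer_uM:.2f} uM")
```

Under these assumptions the reaction is at 1X buffer strength, with each primer at a little under 1 µM.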



The PCR reactions were performed for 40 cycles under the
following conditions:

94°C    1.5 min
45°C    2.0 min
72°C    2.0 min

Five microliters of each reaction was analyzed by agarose gel
electrophoresis. The sizes of the DNA fragments were estimated
from DNA molecular weight markers. The reaction primed with
ZC3220 and ZC3221 produced two DNA fragments of appropriate
size to be candidates for fragments of C-family cellulases. The
agarose sections containing these two fragments were excised,
and the DNA was electroeluted and digested with the restriction
enzymes KpnI and XbaI. The fragments were ligated into the
vector pUC18 which had been cut with the same two restriction
enzymes. The ligations were transformed into E. coli and mini-
prep DNA was prepared from the resulting colonies. The DNA
sequences of these inserts were determined and revealed that
two new C-family cellulases had been identified, one a new
cellobiohydrolase and the other a new endoglucanase.

The PCR cloning strategy described above for the C-family
cellulases was applied using other primers which encoded
conserved cellulase sequences within the known F-family
cellulases (cf. Henrissat et al., op. cit.). The following
primer pair was used for amplification of Fusarium genomic DNA.

Primer 1    Primer 2
ZC3226      ZC3227

ZC3226: TCC TGA CGC CAA GCT TT(A/G/T) (C/T)(A/T)(A/T)
        (A/C/T)AA (C/T)GA (C/T)TA (C/T)AA

ZC3227: CAC CGG CAC CAT CGA T(G/A)T C(A/C/G/T)A
        (G/A)(C/T)T C(A/G/C/T)G T(A/G/T)A T
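The degeneracy of each primer, i.e. the number of distinct oligonucleotides in the synthesized pool, follows directly from the bracketed alternatives. The sketch below, in Python, is offered only as an illustration: the function names are hypothetical, and the primer strings are transcribed from the listings above with stray spacing removed. It also checks that the non-degenerate 5' part of each primer carries the recognition sequence of the restriction enzyme used for cloning (KpnI GGTACC, XbaI TCTAGA, HindIII AAGCTT, ClaI ATCGAT).

```python
import re
from math import prod

PRIMERS = {
    "ZC3220": "GCC AAC TAC GGT ACC GG(A/C/G/T) TA(C/T) TG(C/T) GA(C/T) (A/G/T)(C/G)(A/G/C/T) CA(G/A) TG",
    "ZC3221": "GCG TTG GCC TCT AGA AT(G/A) TCC AT(C/T) TC(A/G/C/T) (C/G/T)(A/T)(G/A) CA(G/A) CA",
    "ZC3226": "TCC TGA CGC CAA GCT TT(A/G/T) (C/T)(A/T)(A/T) (A/C/T)AA (C/T)GA (C/T)TA (C/T)AA",
    "ZC3227": "CAC CGG CAC CAT CGA T(G/A)T C(A/C/G/T)A (G/A)(C/T)T C(A/G/C/T)G T(A/G/T)A T",
}
# Recognition sequences of the cloning enzymes named in the text.
SITES = {"ZC3220": "GGTACC", "ZC3221": "TCTAGA", "ZC3226": "AAGCTT", "ZC3227": "ATCGAT"}

def positions(primer):
    """Split a primer into per-position sets of allowed bases."""
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT])", primer.replace(" ", ""))
    return [set(group.split("/")) if group else {base} for group, base in tokens]

def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(p) for p in positions(primer))

def fixed_prefix(primer):
    """Non-degenerate 5' part of the primer, up to the first mixed position."""
    prefix = []
    for allowed in positions(primer):
        if len(allowed) > 1:
            break
        prefix.append(next(iter(allowed)))
    return "".join(prefix)

for name, seq in PRIMERS.items():
    has_site = SITES[name] in fixed_prefix(seq)
    print(name, len(positions(seq)), "nt,", degeneracy(seq), "variants, site present:", has_site)
```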

The PCR reactions were performed for 40 cycles as follows:

94°C    1.5 min
50°C    2.0 min
72°C    2.0 min

The 180 bp band was eluted from an agarose gel fragment,
digested with the restriction enzymes Hind III and Cla I and
ligated into pUC19 which had been digested with Hind III and
AccI. The ligated DNA was transformed into E. coli and mini-
prep DNA was prepared from colony isolates. The DNA sequence of
the cloned DNA was determined. This fragment encoded sequences
corresponding to a new member of the F-family cellulases.

Construction of a Fusarium oxysporum cDNA library

Fusarium oxysporum was grown by fermentation and samples were
withdrawn at various times for RNA extraction and cellulase
activity analysis. The activity analysis included an assay for
total cellulase activity as well as one for colour
clarification. Fusarium oxysporum samples demonstrating maximal
colour clarification were extracted for total RNA from which
poly(A)+ RNA was isolated.

To construct a Fusarium oxysporum cDNA library, first-strand
cDNA was synthesized in two reactions, one with and the other
without radiolabeled dATP. A 2.5X reaction mixture was
prepared at room temperature by mixing the following reagents
in the following order: 10 µl of 5X reverse transcriptase
buffer (Gibco-BRL, Gaithersburg, Maryland), 2.5 µl 200 mM
dithiothreitol (made fresh or from a stock solution stored at
-70°C), and 2.5 µl of a mixture containing 10 mM of each
deoxynucleotide triphosphate (dATP, dGTP, dTTP and 5-methyl
dCTP, obtained from Pharmacia LKB Biotechnology, Alameda, CA).
The reaction mixture was divided into each of two tubes of 7.5
µl. 1.3 µl of 10 µCi/µl 32P α-dATP (Amersham, Arlington
Heights, IL) was added to one tube and 1.3 µl of water to the
other. Seven microliters of each mixture was transferred to
final reaction tubes. In a separate tube, 5 µg of Fusarium
oxysporum poly(A)+ RNA in 14 µl of 5 mM Tris-HCl pH 7.4, 50 µM
EDTA was mixed with 2 µl of 1 µg/µl first strand primer (ZC2938
GACAGAGCACAGAATTCACTAGTGAGCTCT15). The RNA-primer mixture was
heated at 65°C for 4 minutes, chilled in ice water, and
centrifuged briefly in a microfuge. Eight microliters of the
RNA-primer mixture was added to the final reaction tubes. Five
microliters of 200 U/µl Superscript™ reverse transcriptase
(Gibco-BRL) was added to each tube. After gentle agitation, the
tubes were incubated at 45°C for 30 minutes. Eighty microliters
of 10 mM Tris-HCl pH 7.4, 1 mM EDTA was added to each tube, the
samples were vortexed, and briefly centrifuged. Three
microliters was removed from each tube to determine counts
incorporated by TCA precipitation and the total counts in the
reaction. A 2 µl sample from each tube was analyzed by gel
electrophoresis. The remainder of each sample was ethanol
precipitated in the presence of oyster glycogen. The nucleic
acids were pelleted by centrifugation, and the pellets were
washed with 80% ethanol. Following the ethanol wash, the
samples were air dried for 10 minutes. The first strand
synthesis yielded 1.6 µg of Fusarium oxysporum cDNA, a 33%
conversion of poly(A)+ RNA into DNA.
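The successive dilutions in the first strand reaction can be followed with a short calculation. This is a sketch only, written in Python with illustrative names; it assumes that the reported 1.6 µg is the combined yield of both tubes and that volumes are additive.

```python
def dilute(conc, vol_in, vol_total):
    """Concentration after a volume vol_in is brought up to vol_total."""
    return conc * vol_in / vol_total

# Stock concentration and volume (ul) of each reagent in the master mixture.
# Buffer is expressed in X, dithiothreitol and each dNTP in mM.
stocks = {"buffer_X": (5.0, 10.0), "DTT_mM": (200.0, 2.5), "dNTP_mM": (10.0, 2.5)}
mix_volume = sum(vol for _, vol in stocks.values())   # 15 ul in total

final = {}
for name, (conc, vol) in stocks.items():
    c = dilute(conc, vol, mix_volume)      # in the 15 ul master mixture
    c = dilute(c, 7.5, 7.5 + 1.3)          # after adding label or water
    c = dilute(c, 7.0, 7.0 + 8.0 + 5.0)    # 7 ul mix + 8 ul RNA-primer + 5 ul enzyme
    final[name] = c

rna_per_tube_ug = 5.0 * 8.0 / (14.0 + 2.0)  # poly(A)+ RNA carried into each tube
conversion = 1.6 / 5.0                      # reported cDNA yield over total RNA input

for name, value in final.items():
    print(f"{name}: {value:.3f}")
print(f"RNA per tube: {rna_per_tube_ug} ug, conversion: {conversion:.0%}")
```

On these assumptions each 20 µl reaction contains roughly 1X buffer, 10 mM dithiothreitol and 0.5 mM of each deoxynucleotide triphosphate, and the conversion comes out at about one third, in line with the figure given in the text.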
Second strand cDNA synthesis was performed on the RNA-DNA
hybrid from the first strand reactions under conditions which
encouraged first strand priming of second strand synthesis
resulting in hairpin DNA. The first strand products from each
of the two first strand reactions were resuspended in 71 µl of
water. The following reagents were added, at room temperature,
to the reaction tubes: 20 µl of 5X second strand buffer (100 mM
Tris pH 7.4, 450 mM KCl, 23 mM MgCl2, and 50 mM (NH4)2SO4), 3
µl of 5 mM β-NAD, and [?] µl of a deoxynucleotide triphosphate
mixture with each at 10 mM. One microliter of α-32P dATP was
added to the reaction mixture which received unlabeled dATP for
the first strand synthesis, while the tube which received
labeled dATP for first strand synthesis received 1 µl of water.
Each tube then received 0.6 µl of 7 U/µl E. coli DNA ligase
(Boehringer-Mannheim, Indianapolis, IN), 3.1 µl of 8 U/µl E.
coli DNA polymerase I (Amersham), and 1 µl of 2 U/µl RNase H
(Gibco-BRL). The reactions were incubated at 16°C for 2 hours.
After incubation, 2 µl from each reaction was used to determine
TCA precipitable counts and total counts in the reaction, and
2 µl from each reaction was analyzed by gel electrophoresis. To
the remainder of each sample, 2 µl of 2.5 µg/µl oyster
glycogen, 5 µl of 0.5 M EDTA and 200 µl of 10 mM Tris-HCl pH 7.4,
1 mM EDTA were added. The samples were phenol-chloroform
extracted and isopropanol precipitated. After centrifugation
the pellets were washed with 100 µl of 80% ethanol and air
dried. The yield of double stranded cDNA in each of the
reactions was approximately 2.5 µg.

Mung bean nuclease treatment was used to clip the single-
stranded DNA of the hairpin. Each cDNA pellet was resuspended
in 15 µl of water and 2.5 µl of 10X mung bean buffer (0.3 M
NaAc pH 4.6, 3 M NaCl, and 10 mM ZnSO4), 2.5 µl of 10 mM DTT,
2.5 µl of 50% glycerol, and 2.5 µl of 10 U/µl mung bean
nuclease (New England Biolabs, Beverly, MA) were added to each
tube. The reactions were incubated at 30°C for 30 minutes and
75 µl of 10 mM Tris-HCl pH 7.4 and 1 mM EDTA was added to each
tube. Two-microliter aliquots were analyzed by alkaline agarose
gel analysis. One hundred microliters of 1 M Tris-HCl pH 7.4
was added to each tube and the samples were phenol-chloroform
extracted twice. The DNA was isopropanol precipitated and
pelleted by centrifugation. After centrifugation, the DNA
pellet was washed with 80% ethanol and air dried. The yield was
approximately 2 µg of DNA from each of the two reactions.

The cDNA ends were blunted by treatment with T4 DNA polymerase.
DNA from the two samples was combined after resuspension in a
total volume of 24 µl of water. Four microliters of 10X T4
buffer (330 mM Tris-acetate pH 7.9, 670 mM KAc, 100 mM MgAc,
and 1 mg/ml gelatin), 4 µl of 1 mM dNTP, 4 µl of 50 mM DTT, and
4 µl of 1 U/µl T4 DNA polymerase (Boehringer-Mannheim) were
added to the DNA. The samples were incubated at 15°C for 1 hour.
After incubation, 160 µl of 10 mM Tris-HCl pH 7.4, 1 mM EDTA
was added, and the sample was phenol-chloroform extracted. The
DNA was isopropanol precipitated and pelleted by
centrifugation. After centrifugation the DNA was washed with
80% ethanol and air dried.

After resuspension of the DNA in 6.5 µl water, Eco RI adapters
were added to the blunted DNA. One microliter of 1 µg/µl Eco RI
adapter (Invitrogen, San Diego, CA, Cat. # N409-20), 1 µl of 10X
ligase buffer (0.5 M Tris pH 7.8 and 50 mM MgCl2), 0.5 µl of 10
mM ATP, 0.5 µl of 100 mM DTT, and 1 µl of 1 U/µl T4 DNA ligase
(Boehringer-Mannheim) were added to the DNA. After the sample
was incubated overnight at room temperature, the ligase was
heat denatured at 65°C for 15 minutes.

The Sst I cloning site encoded by the first strand primer was
exposed by digestion with Sst I endonuclease. Thirty-three
microliters of water, 5 µl of 10X Sst I buffer (0.5 M Tris pH
8.0, 0.1 M MgCl2, and 0.5 M NaCl), and 2 µl of 5 U/µl Sst I
were added to the DNA, and the samples were incubated at 37°C
for 2 hours. One hundred and fifty microliters of 10 mM Tris-
HCl pH 7.4, 1 mM EDTA was added, the sample was phenol-
chloroform extracted, and the DNA was isopropanol precipitated.
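As a plausibility check on the volumes given for the enzymatic steps above, the strength of each 10X buffer after dilution into its reaction can be computed. The sketch below is illustrative only; it assumes that volumes are additive and that the whole heat-treated adapter ligation is carried into the Sst I digestion, neither of which is stated explicitly in the text.

```python
# Volumes in microliters for the adapter ligation:
# DNA, adapter, 10X buffer, ATP, DTT, ligase.
ligation = [6.5, 1, 1, 0.5, 0.5, 1]

# step name: (volume of 10X buffer, all volumes present in the reaction)
steps = {
    "mung bean nuclease": (2.5, [15, 2.5, 2.5, 2.5, 2.5]),
    "T4 DNA polymerase": (4, [24, 4, 4, 4, 4]),
    "Eco RI adapter ligation": (1, ligation),
    # Assumption: the complete ligation reaction goes into the digest.
    "Sst I digestion": (5, [sum(ligation), 33, 5, 2]),
}

def buffer_strength(buffer_ul, volumes_ul):
    """Final strength (in X) of a 10X buffer diluted into the reaction."""
    return 10 * buffer_ul / sum(volumes_ul)

strengths = {name: buffer_strength(b, vols) for name, (b, vols) in steps.items()}
for name, x in strengths.items():
    print(f"{name}: {x:.2f}X")
```

Each step comes out at, or within a few percent of, 1X buffer strength, which is consistent with the volumes as reconstructed here.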
The cDNA was chromatographed on a Sepharose CL 2B (Pharmacia
LKB Biotechnology) column to size-select the cDNA and to remove
free adapters. A 1.1 ml column of Sepharose CL 2B was poured
into a 1 ml plastic disposable pipet and the column was washed
with 50 column volumes of buffer (10 mM Tris pH 7.4 and 1 mM
EDTA). The sample was applied, one-drop fractions were
collected, and the DNA in the void volume was pooled. The
fractionated DNA was isopropanol precipitated. After
centrifugation the DNA was washed with 80% ethanol and air
dried.

A Fusarium oxysporum cDNA library was established by ligating
the cDNA to the vector pYcDE8' (cf. WO 90/10698) which had been
digested with Eco RI and Sst I. Three hundred and ninety
nanograms of vector was ligated to 400 ng of cDNA in an 80 µl
ligation reaction containing 8 µl of 10X ligase buffer, 4 µl
of 10 mM ATP, 4 µl 200 mM DTT, and 1 unit of T4 DNA ligase
(Boehringer-Mannheim). After overnight incubation at room
temperature, 5 µg of oyster glycogen and 120 µl of 10 mM Tris-
HCl and 1 mM EDTA were added and the sample was phenol-
chloroform extracted. The DNA was ethanol precipitated,
centrifuged, and the DNA pellet washed with 80% ethanol. After
air drying, the DNA was resuspended in 3 µl of water. Thirty
seven microliters of electroporation competent DH10B cells
(Gibco-BRL) was added to the DNA, and electroporation was
completed with a Bio-Rad Gene Pulser (Model #1652076) and Bio-
Rad Pulse Controller (Model #1652098) electroporation unit
(Bio-Rad Laboratories, Richmond, CA). Four milliliters of SOC
(Hanahan, J. Mol. Biol. 166 (1983), 557-580) was added to the
electroporated cells, and 400 µl of the cell suspension was
spread on each of ten 150 mm LB ampicillin plates. After an
overnight incubation, 10 ml of LB amp media was added to each
plate, and the cells were scraped into the media. Glycerol
stocks and plasmid preparations were made from each plate. The
library background (vector without insert) was established at
approximately 1% by ligating the vector without insert and
titering the number of clones after electroporation.
Screening the cDNA library

Full length cellulase cDNA clones were isolated from the
Fusarium oxysporum cDNA library by hybridization to PCR
generated genomic oligonucleotide probes.

The PCR-generated oligonucleotides: ZC3309, a 40-mer coding for
part of the C family cellobiohydrolase, ATT ACC AAC ACC AGC GTT
GAC ATC ACT GTC AGA GGG CTT C; ZC3310, a 28-mer coding for the
C family endoglucanase, AAC TCC GTT GAT GAA AGG AGT GAC GTA G;
and ZC3311, a 40-mer coding for the F family cellulase, CGG AGA
GCA GCA GGA ACA CCA GAG GCA GGG TTC CAG CCA C, were end labeled
with T4 polynucleotide kinase and 32P gamma ATP. For the
kinase reaction 17 picomoles of each oligonucleotide were
brought up to 12.5 µl volume with deionized water. To these
were added 2 µl 10X kinase buffer (1X: 10 mM magnesium
chloride, 0.1 mM EDTA, 50 mM Tris pH 7.8), 0.5 µl 200 mM
dithiothreitol, 1 µl 32P gamma ATP (150 mCi/ml, Amersham), 2 µl
T4 polynucleotide kinase (10 U/µl, BRL). The samples were then
mixed and incubated at 37°C for 30 minutes. Oligonucleotides
were separated from unincorporated nucleotides by precipitation
with 180 µl TE (10 mM Tris pH 8.0, 1 mM EDTA), 100 µl 7.5 M
ammonium acetate, 2 µl mussel glycogen (20 mg/ml, Gibco-BRL)
and 750 µl 100% ethanol. Pellets were dissolved in 200 µl
distilled water. To determine the amount of radioactivity
incorporated in the oligonucleotides, 10 µl of 1:1000 dilutions
of oligonucleotides were read without scintillation fluid in a
Beckman LS 1800 Liquid Scintillation System. Activities were:
115 million cpm for ZC3309, 86 million cpm for ZC3310, and 79
million cpm for ZC3311.
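The stated probe lengths can be checked against the sequences, and an approximate specific activity follows from the reported counts. The sketch below is illustrative only; it assumes that the reported activities are the total counts recovered for each probe and that all 17 pmol of oligonucleotide was recovered after precipitation, neither of which the text states explicitly.

```python
# name: (sequence as given in the text, stated length, reported activity in cpm)
probes = {
    "ZC3309": ("ATT ACC AAC ACC AGC GTT GAC ATC ACT GTC AGA GGG CTT C", 40, 115e6),
    "ZC3310": ("AAC TCC GTT GAT GAA AGG AGT GAC GTA G", 28, 86e6),
    "ZC3311": ("CGG AGA GCA GCA GGA ACA CCA GAG GCA GGG TTC CAG CCA C", 40, 79e6),
}
PMOL_LABELED = 17  # picomoles of each oligonucleotide in the kinase reaction

def length(seq):
    """Number of nucleotides, ignoring the spacing used for readability."""
    return len(seq.replace(" ", ""))

def specific_activity(cpm, pmol=PMOL_LABELED):
    """Counts per minute per picomole, assuming full recovery of the probe."""
    return cpm / pmol

for name, (seq, stated, cpm) in probes.items():
    print(name, length(seq), "nt (stated:", stated, ")",
          f"{specific_activity(cpm):.2e} cpm/pmol")
```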

Initially, a library of 20,000 cDNA clones was probed with a
mixture of each of the three oligonucleotides corresponding to
the C family cellobiohydrolase, C family endoglucanase and F
family cellulase clones. The cDNA library was plated out from
titered glycerol stocks stored at -70°C. Four thousand clones
were plated out on each of five 150 mm LB ampicillin (1000
µg/ml) plates. Lifts were taken in duplicate following standard
methodology (Sambrook et al., Molecular Cloning, 1989) using
Biotrans 0.2 µm 137 mm filters. The filters were baked at 80°C
in vacuum for 2 hours, then swirled overnight in a
crystallizing dish (Pharmacia LKB Biotechnology, Alameda, CA)
at 37°C in 80 ml prehybridization solution (5X Denhardt's (1X:
0.02% Ficoll, 0.02% polyvinylpyrrolidone, 0.02% bovine serum
albumin Pentax Fraction 5 (Sigma, St. Louis, MO)), 5X SSC (1X:
0.15 M sodium chloride, 0.15 M sodium citrate pH 7.3), 100
µg/ml denatured sonicated salmon sperm DNA, 50 mM sodium
phosphate pH 6.8, 1 mM sodium pyrophosphate, 100 µM ATP, 20%
formamide, 1% sodium dodecyl sulfate) (Ulrich et al. EMBO J. 3
(1984), 361-364).

Prehybridized filters were probed by adding them one at a time
into a crystallizing dish with 80 ml prehybridization solution
with 80 million cpm ZC3309, 86 million cpm ZC3310 and 79
million cpm ZC3311 and then swirled overnight at 37°C. Filters
were then washed to high stringency. The probed filters were
washed with three 400 ml volumes of low stringency wash
solution (2X SSC, 0.1% SDS) at room temperature in the
crystallizing dish, then with four 1-liter volumes in a plastic
box. A further wash for 20 minutes at 68°C with
tetramethylammonium chloride wash solution (TMACL: 3 M
tetramethylammonium chloride, 50 mM Tris-HCl pH 8.0, 2 mM EDTA,
1 g/l SDS) (Wood et al., Proc. Natl. Acad. Sci. 82 (1985),
1585-1588) provided a high stringency wash for the 28-mer
ZC3310 independent of its base composition. The filters
were then blotted dry, mounted on Whatman 3MM paper and covered
with plastic wrap for autoradiography. They were exposed
overnight at -70°C with intensifying screens and Kodak XAR-5
film.

Two putative positives appeared on duplicate filters. The
corresponding areas on the plates with colonies were picked
into 1 ml of 1X polymerase chain reaction (PCR) buffer (100 mM
Tris HCl pH 8.3, 500 mM KCl, 15 mM MgCl2, 0.1% gelatin; Perkin
Elmer Cetus) and plated out at five tenfold dilutions on 100 mm
LB plates with 70 µg/ml ampicillin. These plates were grown at
37°C overnight. Two dilutions of each putative clone were
chosen for rescreening as outlined above. One isolated clone,
pZFH196, was found. This was grown up overnight in 10 ml 2X YT
broth (per liter: 16 g bacto-tryptone, 10 g bacto-yeast
extract, 10 g NaCl). Twenty three micrograms of DNA were
purified by the rapid boiling method (Holmes and Quigley,
Anal. Biochem. 114 (1981), 193-197). From restriction analysis
the clone was found to be approximately 2,000 base pairs in
length. Sequence analysis showed it to contain a fragment
homologous to the C family cellobiohydrolase fragment cloned by
PCR.



In an attempt to isolate additional cellulase cDNA clones, a
cDNA library of 2 million clones was plated out on 20 150 mm LB
plates (100 µg/ml ampicillin), each containing approximately
100,000 cDNA clones. Lifts were taken in duplicate as in the
first screening attempt. This library was screened with
oligonucleotides corresponding to the three cellulase species
as described above except that the hybridization was carried
out with formamide in the prehybridization buffer and at a
temperature of 30°C. Washing with TMACL was carried out twice
for 20 minutes at 67°C. Between 8 and 20 signals were found on
duplicate filters of each of the 20 plates. Fifteen plugs were
taken from the first plate with the large end of a pasteur
pipet into 1 ml 1X PCR buffer (Perkin-Elmer Cetus). PCR was
carried out on the bacterial plugs with three separate
oligonucleotide mixtures. Each mixture contained the vector
specific oligonucleotide ZC2847 and additionally, a different
cellulase specific oligonucleotide (ZC3309, ZC3310 or ZC3311)
within each mixture. AmpliTaq polymerase (Perkin-Elmer Cetus)
was used with Pharmacia Ultrapure dNTP and following Perkin
Elmer Cetus procedures. Sixteen picomoles of each primer were
used in 40 µl reaction volumes. Twenty microliters of cells in
1X PCR buffer were added to 20 µl mastermix which contained
everything needed for PCR except for DNA. After an initial 1
minute 45 second denaturation at 94°C, 28 cycles of 45 seconds
at 94°C, 1 minute at 45°C and 2 minutes at 72°C, with a final
extension of 10 minutes at 72°C, were employed in a Perkin Elmer
thermocycler. Ten of the 15 plugs yielded a band when primed
with the C family specific oligonucleotide ZC3309 and ZC2847.
The other mixtures gave no specific products. Five plugs which
produced the largest bands by PCR, therefore possibly being
full length C family cellobiohydrolases, along with the 5 plugs
which did not produce PCR bands, were plated out at five
tenfold dilutions onto 100 mm LB plates with 70 µg/ml ampicillin
and grown overnight. Duplicate lifts were taken of two tenfold
dilutions each. Prehybridization and hybridization were carried
out as described above with a mixture of the 3
oligonucleotides. Isolated clones were found on all 10 of the
platings. These were picked from the dilution plates with a
toothpick for single colony isolation on 100 mm LB plates with
70 µg/ml ampicillin. PCR was carried out on isolated bacterial
colonies with 2 oligonucleotides specific for the C family
cellobiohydrolase (ZC3409 (CCG TTC TGG ACG TAC AGA) and ZC3411
(TGA TGT CAA GTT CAT CAA)). Conditions were identical to those
described above except for using 10 picomoles of each primer in
25 µl reaction volumes. Colonies were added by toothpick into
PCR tubes with 25 µl mastermix before cycling. Five of the 10
gave strong bands of the size expected for a C family
cellobiohydrolase. Isolated colonies were then grown up in 20
ml of Terrific Broth (Sambrook et al., op. cit., A2) and DNA
was isolated by the rapid boiling method. The clones were
partially sequenced by Sanger dideoxy sequencing. From sequence
analysis the 5 clones which did not give bands specific for a
C family cellobiohydrolase by PCR were shown to be F family
cellulase clones.
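The cycling programme and primer amounts given above can be summarized numerically. This is a rough sketch in Python with illustrative function names; ramp times between temperatures are ignored, so the real run time of the instrument would be longer.

```python
def program_seconds(initial, cycles, steps, final):
    """Total hold time of a thermocycler programme, ignoring ramp times."""
    return initial + cycles * sum(steps) + final

# 1 min 45 s at 94 C, then 28 cycles of 45 s / 60 s / 120 s,
# then a 10 min final extension at 72 C.
colony_pcr_seconds = program_seconds(initial=105, cycles=28,
                                     steps=[45, 60, 120], final=600)

def primer_uM(pmol, volume_ul):
    """Primer concentration; pmol per microliter is numerically micromolar."""
    return pmol / volume_ul

plug_reaction = primer_uM(16, 40)     # PCR on bacterial plugs
colony_reaction = primer_uM(10, 25)   # PCR on isolated colonies

print(f"hold time: {colony_pcr_seconds / 60:.1f} min")
print(f"primer: {plug_reaction} uM (plugs), {colony_reaction} uM (colonies)")
```

Both reaction formats work out to the same primer concentration, which agrees with the statement that the conditions were otherwise identical.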

In order to clone the C family endoglucanase, the cDNA library
of 2 million clones was rescreened with only ZC3310. Conditions
of prehybridization and hybridization were like those used
above. Filters were hybridized for 10 hours at 30°C with one
million cpm endlabeled ZC3310 per ml prehybridization solution
without formamide. Washing with TMACL was carried out 2 times
for 20 minutes at 60°C. Seven weak signals were found on
duplicate filters. Plugs were picked with the large end of a
pipet into 1 ml LB broth. These were each plated out in 5
tenfold dilutions on 100 mm LB plates with 70 µg/ml ampicillin.
Duplicate lifts were taken of 2 dilutions each and were
processed as described above. Prehybridization, hybridization,
and washing were carried out as for the first level of
screening. Three isolated clones were identified and streaked
out for single colony hybridization. Isolates were grown
overnight in 50 ml of Terrific Broth (per liter: 12 g tryptone,
24 g yeast extract, 4 ml glycerol, autoclaved, and 100 ml of
0.17 M KH2PO4, 0.72 M K2HPO4 (Sambrook et al., op. cit., A2))
and DNA was isolated by alkaline lysis and PEG precipitation by
standard methods (Maniatis 1989, 1.38-1.41). From restriction
analysis, one clone (pZFH223) was longer than the others and
was chosen for complete sequencing. Sequence analysis showed it
to contain the PCR fragment cloned initially.

DNA sequence analysis

The cDNAs were sequenced in the yeast expression vector
pYcDE8'. The dideoxy chain termination method (F. Sanger et
al., Proc. Natl. Acad. Sci. USA 74, 1977, pp. 5463-5467) using
35S dATP from New England Nuclear (cf. M.D. Biggin et al.,
Proc. Natl. Acad. Sci. USA 80, 1983, pp. 3963-3965) was used
for all sequencing reactions. The reactions were catalysed by
modified T7 DNA polymerase from Pharmacia (cf. S. Tabor and
C.C. Richardson, Proc. Natl. Acad. Sci. USA 84, 1987, pp. 4767-
4771) and were primed with an oligonucleotide complementary to
the ADH1 promoter (ZC996: ATT GTT CTC GTT CCC TTT CTT),
complementary to the CYC1 terminator (ZC3635: TGT ACG CAT GTA
ACA TTA) or with oligonucleotides complementary to the DNA of
interest. Double stranded templates were denatured with NaOH
(E.Y. Chen and P.H. Seeburg, DNA 4, 1985, pp. 165-170) prior to
hybridizing with a sequencing oligonucleotide. Oligonucleotides
were synthesized on an Applied Biosystems Model 380A DNA
synthesizer. The oligonucleotides used for the sequencing
reactions are listed in the sequencing oligonucleotide table
below:



C-family cellobiohydrolase sequencing primers

ZC3411   TGA TGT CAA GTT CAT CAA
ZC3408   TCT GTA CGT CCA GAA CGG
ZC3407   ATG ACT TCT CTA AGA AGG
ZC3406   TCC AAC ATC AAG TTC GGT
ZC3410   AGG CCA ACT CCA TCT GAA
ZC3309   ATT ACC AAC ACC AGC GTT GAC ATC ACT GTC AGA GGG CTC C
ZC3409   CCG TTC TGG ACG TAC AGA

F-family cellulase specific sequencing primers

ZC3413   CCA TCG ACG GTA TTG GAT
ZC3311   CGG AGA GCA GCA GGA ACA CCA GAG GCA GGG TTC CAG CCA C
ZC3412   GAG GGT AGA GCG ATC GTT

C-family endoglucanase specific sequencing primers

ZC3739   TGA TCT CAT CGA GCT GCA CC
ZC3684   GTG ATG CTC AGT GCT ACG TC
ZC3310   AAC TCC GTT GAT GAA AGG AGT GAC GTA G
ZC3750   TCC AAT AGC TTC CCA GCA AG
ZC3683   TGT CCC TTG ATG TTG CCA AC
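Two of the listed cellobiohydrolase primers, ZC3408 and ZC3409, are exact reverse complements of one another, so they anneal to opposite strands at the same position and allow sequencing in both directions from that site. A small Python sketch (the function name is illustrative) makes the relationship explicit:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.replace(" ", "").translate(COMPLEMENT)[::-1]

ZC3408 = "TCT GTA CGT CCA GAA CGG"
ZC3409 = "CCG TTC TGG ACG TAC AGA"

# True when the two primers cover the same site on opposite strands.
same_site = reverse_complement(ZC3408) == ZC3409.replace(" ", "")
print("ZC3408 and ZC3409 are reverse complements:", same_site)
```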



The DNA sequences of the full-length cDNA clones, as well as
the derived amino acid sequences, are shown in the appended
Figs. 11 (C-family cellobiohydrolase), 12 (F-family cellulase)
and 13 (C-family endoglucanase).
Example 4

Isolation of endoglucanase EGI gene from H. insolens

The cDNA library described in example 1 was also screened with
a 35 bp oligonucleotide probe in the antisense configuration
with the sequence:

NOR-770: 5' GCTTCGCCCATGCCTTGGGTGGCGCCGAGTTCCAT 3'

The sequence was derived from the amino acid sequence of an
alcalase fragment of EGI purified from H. insolens, using our
knowledge of codon bias in this organism. Complete clones of
1.6 kb contained the entire coding sequence of 1.3 kb as shown
in Fig. 14A-E. The probe sequence NOR-770 is located at Met344-
Ala355.
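Because NOR-770 is in the antisense configuration, the peptide it was designed from is obtained by translating its reverse complement. The following Python sketch uses the standard genetic code; the helper names are illustrative and not part of the original disclosure.

```python
# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate complete codons only; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

NOR_770 = "GCTTCGCCCATGCCTTGGGTGGCGCCGAGTTCCAT"  # antisense probe, 5' to 3'
sense = reverse_complement(NOR_770)
peptide = translate(sense)            # eleven complete codons
leftover = sense[len(peptide) * 3:]   # first two bases of the twelfth codon

print(sense)
print(peptide, "+ partial codon", leftover)
```

The eleven complete codons begin with Met, and the remaining two bases, GC, can only belong to an alanine codon (GCN), giving twelve residues in all. This matches the stated location of the probe at Met344-Ala355.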

Construction of expression plasmids of EGI (full length) and
EGI' (truncated)

The EGI gene still containing the poly-A tail was inserted into
an A. oryzae expression plasmid as outlined in Fig. 2. The
coding region of EGI was cut out from the NcoI-site in the
initiating Met-codon to the BamHI-site downstream of the poly-
A region as a 1450 bp fragment from pHW480. This was ligated to
a 3.6 kb NcoI-NarI fragment from pSX224 (Fig. 1) containing the
TAKA promoter and most of pUC19, and to a 960 bp NarI-BamHI
fragment containing the remaining part with the AMG-terminator.
The 960 bp fragment was taken from p960 which is equivalent to
p777 (described in EP 238,023) except for the inserted gene.
The resulting expression plasmid is termed pHW485.

The expression plasmid pHW704 with the full length EGI gene
without poly A tail is shown in Fig. 3. From the BstEII site
1300 bp downstream of the NcoI-site was inserted a 102 bp
BstEII-BamHI linker (2645/2646) ligated to BglII-site in the
vector. The linker contains the coding region downstream of
BstEII-site with 2 stop codons at the end and a PvuI-site near
the C-terminal to be used for addition of CBD and B-regions.
Expression plasmid pHW697 with the truncated EGI' gene was
constructed similarly using a BstEII-BamHI linker (2492/2493)
of 69 bp. In this linker we introduced a PstI-site altering
Val421 to Leu421, and the last 13 amino acids of the coding
region, K423PKPKPGHGPRSD435, were eliminated. The short tail
with the rather unusual sequence was cut off to give EGI' a C-
terminal corresponding to the one found in T. reesei EGI just
upstream of the A and B-region.
Construction of an expression plasmid of EGI' with CBD and B
region from a ~43 kD endoglucanase added C-terminally

The ~43 kD endoglucanase of H. insolens described in DK patent
application No. 736/91 has shown good washing performance.
Besides the catalytic domain, 43 kD cellulase has a C-terminal
CBD and B region which has been transferred to EGI' which does
not have any CBD or B region itself. The construction was done
in 2 steps, as outlined in Fig. 4. The PstI-HincII linker
(028/030 M), intended to connect the C-terminal of EGI' to the
B-region of 43 kD cellulase, was subcloned in pUC19 PstI-EcoRI
with C-terminal HincII-EcoRI 100 bp fragment from 43 kD
cellulase gene in pSX320 (Fig. 5; as described in DK 736/91).
From the subclone pHW767 the CBD and B-region was cut out as a
250 bp PstI-BglII fragment and ligated to pHW485 (Fig. 2)
BstEII-BglII fragment of 5.7 kb and to the remaining BstEII-
PstI fragment of 55 bp from pHW697 (Fig. 3). The resulting
expression plasmid pHW768 has the ~43 kD endoglucanase CBD and
B region added to Gln422 of EGI'.

Construction of an expression plasmid of EGI with the CBD and
B region from ~43 kD endoglucanase added C-terminally

This plasmid was constructed in a similar way as pHW768 except
that, in this case, the C-terminal linker yielded the complete
sequence of EGI. Fig. 6 shows the procedure in 3 steps. The
PvuI-HincII linker (040 M/041 M) was subcloned in pUC18 to give
pHW775, into which a HincII-EcoRI 1000 bp fragment from pSX320
(Fig. 5) was inserted to give pHW776. From this the CBD and B
region was cut out as a 250 bp PvuI-BglII fragment and ligated
to 5.7 kb BstEII-BglII fragment from pHW485 (Fig. 2) and 90 bp
BstEII-PvuI fragment from pHW704 (Fig. 3). The resulting
expression plasmid pHW777 contains the ~43 kD endoglucanase
CBD and B region added to Asp435 in the complete EGI sequence.



Expression in A. oryzae of EGI and EGI' with and without the
CBD and B region from ~43 kD endoglucanase

The expression plasmids pHW485, pHW704, pHW697, pHW768 and
pHW777 were transformed into A. oryzae IFO 4177 as described in
example 2. Supernatants from transformants grown in YPD medium
as described were analyzed by SDS-PAGE, where the native EGI
has an apparent Mw of 53 kD. EGI' looks slightly smaller as
expected, and the species with the added CBD and B region are
increased in molecular weight corresponding to the size of the
CBD and B region with some carbohydrate added. A polyclonal
antibody AS169 raised against the ~43 kD endoglucanase
recognizes EGI and EGI' only when the ~43 kD CBD and B region
are added, while all 4 species are recognized by a polyclonal
antibody AS78 raised against a cellulase preparation from H.
insolens. All 4 species have endoglucanase activity as measured
on soluble cellulose in the form of carboxymethyl cellulose.



Linkers

2492/2493: BstE2-PstI-BamH1

5' GTCACCTACACCAACCTCCGCTGGGGCGAG
3'      GATGTGGTTGGAGGCGACCCCGCTC

   ATCGGCTCGACCTACCAGGAGCTGCAGTAGTAA
   TAGCCGAGCTGGATGGTCCTCGACGTCATCATT

   TGATAG 3'       69 bp
   ACTATCCTAG 5'   68 bp

2645/2646: BstE2-XmaI-PvuI-BamH1

5' GTCACCTACACCAACCTCCGCTGGGGCGAGATCGGC
3'      GATGTGGTTGGAGGCGACCCCGCTCTAGCCG

   TCGACCTACCAGGAGGTTCAGAAGCCTAAGCCCAAG
   AGCTGGATGGTCCTCCAAGTCTTCGGATTCGGGTTC

   CCCGGGCACGGCCCCCGATCGGACTAATAG 3'       102 bp
   GGGCCCGTGCCGGGGGCTAGCCTGATTATCCTAG 5'   101 bp

028 M/030 M: PstI-HincII

5'     GTCCAGCAGCACCAGCTCTCCGGTC 3'   25 bp
3' ACGTCAGGTCGTCGTGGTCGAGAGGCCAG 5'   29 bp

040 M/041 M: PvuI-HincII

5'   CGTCCAGCAGCACCAGCTCTCCGGTC 3'   26 bp
3' TAGCAGGTCGTCGTGGTCGAGAGGCCAG 5'   28 bp
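Translating the upper strands of the two BstEII-BamHI linkers shows how they differ at the C-terminus of EGI. The Python sketch below is illustrative only: it uses the standard genetic code, assumes the reading frame starts at the first base of each linker, and the helper names are not part of the original disclosure.

```python
# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(seq):
    """Translate complete codons, writing stop codons as '*'."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

# Upper (5' to 3') strands of the linkers as listed above.
LINKER_2492 = ("GTCACCTACACCAACCTCCGCTGGGGCGAG"
               "ATCGGCTCGACCTACCAGGAGCTGCAGTAGTAA"
               "TGATAG")
LINKER_2645 = ("GTCACCTACACCAACCTCCGCTGGGGCGAGATCGGC"
               "TCGACCTACCAGGAGGTTCAGAAGCCTAAGCCCAAG"
               "CCCGGGCACGGCCCCCGATCGGACTAATAG")

truncated = translate(LINKER_2492).rstrip("*")   # C-terminus of EGI'
full = translate(LINKER_2645).rstrip("*")        # C-terminus of EGI

# Positions at which the shared part of the two peptides differs.
differences = [(i, a, b) for i, (a, b) in enumerate(zip(full, truncated)) if a != b]
tail = full[len(truncated):]

print(truncated)
print(full)
print("substitutions:", differences, "removed tail:", tail)
```

With this frame the truncated linker ends in Glu-Leu-Gln followed by stop codons, while the full length linker reads Glu-Val-Gln followed by the 13-residue tail KPKPKPGHGPRSD. This is in agreement with the Val421 to Leu421 change and the deleted residues 423 to 435 described for EGI'. The PstI site (CTGCAG) is present only in 2492/2493, and the XmaI (CCCGGG) and PvuI (CGATCG) sites only in 2645/2646.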



Example 5

~43 kD endoglucanase with different CBDs and B-regions:

In order to test the influence on the ~43 kD endoglucanase of
the different CBDs and B regions from the A region clones we
have substituted the original CBD and B region from ~43 kD
with the other C-terminal CBDs and B regions, i.e. A-1, A-8, A-
9, A-11, and A-19 (cf. Example 1). In order to test the
concept we have also made a construction where the 43 kD B
region has been deleted.



Fragments:

All fragments were made by PCR amplification using a
Perkin-Elmer/Cetus DNA Amplification System following the
manufacturer's instructions.




1) A PCR fragment was made which covers the DNA from 56
bp upstream of the Bam HI site on pSX 320 (Fig. 5) to 717 bp
within the coding region of the ~43 kD endoglucanase gene and
at the same time introduces a Kpn I site at pos. 708 and a Sma
I site at pos. 702 in the coding region, which is at the very
beginning of the B region. This PCR fragment was made with the
primers NOR 1542 and NOR 3010 (see list of oligonucleotides
below).

2) A PCR fragment was made which includes the CBD and B
region of A-1, introducing a Kpn I site at the very beginning of
the B region in frame with the Kpn I site introduced in 1) and
introducing a Xho I site downstream of the coding region of the
gene. Primers used: NOR 3012 upstream and NOR 3011 downstream.

3) As 2) except that the fragment covered the CBD and B
region of A-8 and the Xho I site in the expression vector
downstream of the gene. Primers: NOR 3017 and NOR 2516.

4) As 2) but with primers NOR 3016 and NOR 3015
covering the CBD and B region from A-9.

5) As 3) but with primers NOR 3021 and NOR 2516 covering
the CBD and B region from A-11.

6) As 2) but with primers NOR 3032 and NOR 3022 covering
the CBD and B region from A-19.

7) A PCR fragment which includes the CBD from the ~43 kD
endoglucanase and the Xho I site downstream from the gene on
pSX 320, introducing a Pvu II site at the very end of the B
region. Primers: NOR 3023 and NOR 2516.

Combinations:

1) + 2) inserted as Bam HI - Kpn I and Kpn I - Xho I into pToC
68 (described in DK 736/91) Bam HI - Xho I, thus coding for the
~43 kD core enzyme with the CBD and B region from A-1.

1) + 3): As above, giving a ~43 kD enzyme with the A-8 CBD and B
region.

1) + 4): As above, but with the A-9 CBD and B region.

1) + 5): As above, but with the A-11 CBD and B region.

1) + 6): As above, but with the A-19 CBD and B region.

1) + 7) inserted as Bam HI - Sma I and Pvu II - Xho I into pToC
68 Bam HI - Xho I, thus coding for the ~43 kD enzyme without the
B region.

Oligonucleotides:

NOR 1542: 5' - CGACAACATCACATCAAGCTCTCC - 3'
NOR 2516: 5' - CCATCCTTTAACTATAGCGA - 3'
NOR 3010: 5' - GCTGGTGCTGGTACCCGGGATCTGGACGGCAGGG - 3' (Kpn, Sma)
NOR 3011: 5' - GCATCGGTACCGGCGGCGGCTCCACTGGCG - 3' (Kpn)
NOR 3012: 5' - CTCACTCCATCTCGAGTCTTTCAATTTACA - 3' (Xho)
NOR 3015: 5' - CTTTTCTCGAGTCCCTTAGTTCAAGCACTGC - 3' (Xho)
NOR 3016: 5' - TGACCGGTACCGGCGGCGGCAACACCAACC - 3' (Kpn)
NOR 3017: 5' - TCACCGGTACCGGCGGTGGAAGCAACAATG - 3' (Kpn)
NOR 3021: 5' - TCTTCGGTACCAGCGGCAACAGCGGCGGCG - 3' (Kpn)
NOR 3022: 5' - CGCTGGGTACCAACAACAATCCTCAGCAGG - 3' (Kpn)
NOR 3023: 5' - CTCCCAGCAGCTGCACTGCTGAGAGGTGGG - 3' (Pvu II)
NOR 3032: 5' - CGGCCTCGAGACCTTACAGGCACTGCGAGT - 3' (Xho I)
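
The restriction sites annotated under the oligonucleotides can be located programmatically. The sketch below is illustrative only: it searches a primer for the standard recognition sequences of Kpn I (GGTACC), Sma I (CCCGGG), Xho I (CTCGAG) and Pvu II (CAGCTG). The helper name `find_sites` and the constant names are assumptions, not part of the specification.

```python
# Standard recognition sequences of the enzymes named in the primer list.
SITES = {
    "Kpn I": "GGTACC",
    "Sma I": "CCCGGG",
    "Xho I": "CTCGAG",
    "Pvu II": "CAGCTG",
}


def find_sites(primer: str) -> dict[str, int]:
    """Return {enzyme: 0-based index of first occurrence} for each site found.

    Spaces are ignored so that sequences can be pasted as printed.
    """
    seq = primer.replace(" ", "").upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


# The oligonucleotide annotated with both Kpn and Sma sites in the list.
KPN_SMA_PRIMER = "GCTGGTGCT GGTACCCGGGA TCTGGACGGCAGGG"
# The oligonucleotide annotated with a Pvu II site.
PVU_PRIMER = "CTCCCAGCAGCTGCACTGCTGAGAGGTGGG"

print(find_sites(KPN_SMA_PRIMER))
print(find_sites(PVU_PRIMER))
```

In the first primer the Kpn I and Sma I sites overlap by two bases (GGTACCCGGG), which is consistent with the two sites being introduced only six positions apart in fragment 1).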

10 

Example 6

Fusion of a bacterial catalytic domain to a fungal CBD

The endoglucanase Endo 1 produced by Bacillus lautus NCIMB
40250 (described in PCT/DK91/00013) consists of a catalytic
domain (core) (Ala(32) - Val(555)) and a C-terminal cellulose
binding domain (CBD) (Gln(556) - Pro(700)) homologous to the CBD
of a B. subtilis endoglucanase (R.M. MacKay et al. 1986. Nucleic
Acids Res. 14, 9159-70). The CBD is proteolytically cleaved off
when the enzyme is expressed in B. subtilis or E. coli,
generating a CMC-degrading core enzyme. In this example this
core protein was fused with the B region and CBD of the ~43 kD
endoglucanase from Humicola insolens (described in DK 736/91).



Construction of the fusion

The plasmid pCaHj 170 containing the cDNA gene encoding the
~43 kD endoglucanase was constructed as shown in Fig. 7. pCaHj
170 was digested with Xho II and Sal I. The 223 bp Xho II - Sal
I fragment was isolated and ligated into pUC 19 (Yanisch-Perron
et al. 1985. Gene 33, 103-119) digested with BamH I and Sal I.
The BamH I site was regenerated by this Xho II-BamH I ligation.
The resulting plasmid, IM 2, was digested with EcoR I and BamH
I and ligated with the linker NOR 3045 - NOR 3046:

NOR 3045  5' AATTCCGCGGAACGATATCTCCGA     3'
NOR 3046  3'     GGCGCCTTGCTATAGAGGCTCTAG 5'

Sites (5' to 3'): EcoR I, Sac II, EcoR V, Mbo I
The resulting plasmid, IM 3, was digested with EcoR V and Sac II
and ligated to the 445 bp Hinc II - Sac II pPL 517 fragment.
pPL 517 contains the entire Bacillus Endo 1 gene
(PCT/DK91/00013). The product of this ligation was termed IM 4.
In order to replace the Bacillus signal peptide of Endo 1 with
the fungal signal peptide from the ~43 kD endoglucanase, four
PCR primers were designed for "Splicing by Overlap Extension"
(SOE) fusion (R.M. Horton et al. (1989). Gene 77, 61-68). The 43
kD signal sequence was amplified from the plasmid pCaHj 109 (DK
736/91), introducing a Bcl I site in the 5' end and a 21 bp
homology to the Bacillus Endo 1 gene in the 3' end, using the 5'
primer NOR 3270 and the 3' primer NOR 3275. The part of the
Endo 1 gene 5' to the unique Sac II site was amplified using
the 5' primer NOR 3276, introducing a 21 bp homology to the 43
kD gene, and the 3' primer NOR 3271 covering the Sac II site.
The two PCR fragments were mixed, melted, annealed and filled up
with Taq polymerase (Fig. 9). The resulting hybrid was
amplified using the primers NOR 3270 and NOR 3271. The hybrid
fragment was digested with Bcl I and Sac II and ligated to the
676 bp Sac II - Sal I fragment from IM 4 and the Aspergillus
expression vector pToC 68 (DK 736/91) digested with BamH I. The
product of this ligation, pCaHj 180 (Fig. 10), contained an
open reading frame encoding the 43 kD signal peptide and the
first four N-terminal amino acids of the mature ~43 kD
endoglucanase (Met(1)-Arg(25)) fused to the core of Endo 1
(Ser(34)-Val(549)), followed by the peptide Ile-Ser-Glu (encoded
by the linker) fused to the 43 kD B region and CBD (Ile(233)-
Leu(285)). pCaHj 180 was used to transform Aspergillus oryzae
IFO 4177 using selection on acetamide by cotransformation with
pToC 90 (cf. DK 736/91) as described in published EP patent
application No. 238 023.

NOR 3270 5' TTGAATTCTGATCAAGATGCGTTCCTCCC 3'
NOR 3275 5' AATGGTGAAAGTGACATCACTCCTGCCATCAGCGGCAAGGGC 3'
NOR 3276 5' GCCCTTGCCGCTGATGGCAGGAGTGATGTCACTTTCACCATT 3'
NOR 3271 5' AGCGCGTCCGCGGTAGCTATG 3'

The sequence of the Endo 1 core and the ~43 kD CBD and B
region is shown in the appended Fig. 15A-D.
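
Splicing by overlap extension relies on the two internal primers being able to anneal to each other. For NOR 3275 and NOR 3276 this can be checked directly, since one should be the reverse complement of the other. The sketch below performs that check; the helper name is an assumption, and the split into two 21-base halves simply mirrors the "21 bp homology" stated in the text rather than any independently verified junction.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


NOR_3275 = "AATGGTGAAAGTGACATCACTCCTGCCATCAGCGGCAAGGGC"
NOR_3276 = "GCCCTTGCCGCTGATGGCAGGAGTGATGTCACTTTCACCATT"

# The two internal SOE primers should anneal over their full length.
overlap_ok = reverse_complement(NOR_3275) == NOR_3276

# Each 42-mer splits into two 21-base halves, one from each gene
# (assumed here to correspond to the "21 bp homology" in the description).
first_half, second_half = NOR_3276[:21], NOR_3276[21:]

print(overlap_ok, len(first_half), len(second_half))
```

Because the primers are fully complementary, the two first-round PCR products share a 42 bp overlap, which is what allows them to prime each other in the fill-in step.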

CLAIMS

1. A cellulose- or hemicellulose-degrading enzyme which is
derivable from a fungus other than Trichoderma or
Phanerochaete, and which comprises a carbohydrate binding
domain homologous to a terminal A region of Trichoderma reesei
cellulases, which carbohydrate binding domain comprises the
following amino acid sequence

1                                   10
Xaa Xaa Gln Cys Gly Gly Xaa Xaa Xaa Xaa Gly Xaa Xaa Xaa Cys Xaa

            20                                      30
Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Asn Xaa Xaa Tyr Xaa Gln Cys Xaa

Xaa

or a subsequence thereof capable of effecting binding of the
enzyme to an insoluble cellulosic or hemicellulosic substrate.

2. An enzyme according to claim 1, which is derivable from a
strain of Humicola, Fusarium or Myceliophthora.

3. An enzyme according to claim 1, wherein the variations in
the amino acid sequence shown in claim 1 are selected as
follows

in position 1, the amino acid is Trp or Tyr;
in position 2, the amino acid is Gly or Ala;
in position 7, the amino acid is Gln, Ile or Asn;
in position 8, the amino acid is Gly or Asn;
in position 9, the amino acid is Trp, Phe or Tyr;
in position 10, the amino acid is Ser, Asn, Thr or Gln;
in position 12, the amino acid is Pro, Ala or Cys;
in position 13, the amino acid is Thr, Arg or Lys;
in position 14, the amino acid is Thr, Cys or Asn;
in position 18, the amino acid is Gly or Pro;
in position 19, the amino acid (if present) is Ser, Thr, Phe,
Leu or Ala;
in position 20, the amino acid is Thr or Lys;
in position 24, the amino acid is Gln or Ile;
in position 26, the amino acid is Gln, Asp or Ala;
in position 27, the amino acid is Trp, Phe or Tyr;
in position 29, the amino acid is Ser, His or Ala; and/or
in position 32, the amino acid is Leu, Ile, Gln, Val or Thr.

4. An enzyme according to claim 3, wherein the carbohydrate
binding domain comprises the following amino acid sequence

Trp Gly Gln Cys Gly Gly Gln Gly Trp Asn Gly Pro Thr Cys Cys Glu
Ala Gly Thr Thr Cys Arg Gln Gln Asn Gln Trp Tyr Ser Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Ile Gly Trp Asn Gly Pro Thr Thr Cys Val
Ser Gly Ala Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Ile Gly Phe Asn Gly Pro Thr Cys Cys Gln
Ser Gly Ser Thr Cys Val Lys Gln Asn Asp Trp Tyr Ser Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Asn Gly Tyr Ser Gly Pro Thr Thr Cys Ala
Glu Gly - Thr Cys Lys Lys Gln Asn Asp Trp Tyr Ser Gln Cys Thr
Pro;

Trp Gly Gln Cys Gly Gly Gln Gly Trp Gln Gly Pro Thr Cys Cys Ser
Gln Gly - Thr Cys Arg Ala Gln Asn Gln Trp Tyr Ser Gln Cys Leu
Asn;

Trp Gly Gln Cys Gly Gly Gln Gly Tyr Ser Gly Cys Thr Asn Cys Glu
Ala Gly Ser Thr Cys Arg Gln Gln Asn Ala Tyr Tyr Ser Gln Cys
Ile;

Trp Gly Gln Cys Gly Gly Gln Gly Tyr Ser Gly Cys Arg Asn Cys Glu
Ser Gly Ser Thr Cys Arg Ala Gln Asn Asp Trp Tyr Ser Gln Cys
Leu;

Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys Val
Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Gln Asn Tyr Ser Gly Pro Thr Thr Cys Lys
Ser Pro Phe Thr Cys Lys Lys Ile Asn Asp Phe Tyr Ser Gln Cys
Gln; or

Trp Gly Gln Cys Gly Gly Asn Gly Trp Thr Gly Ala Thr Thr Cys Ala
Ser Gly Leu Lys Cys Glu Lys Ile Asn Asp Trp Tyr Tyr Gln Cys Val.

5. An enzyme according to any of claims 1-4, which further
comprises an amino acid sequence which defines a linking B
region connecting the carbohydrate binding domain to the
catalytically active domain of the enzyme.

6. An enzyme according to claim 5, wherein the linking B region
is one which is enriched in the amino acids glycine and/or
asparagine and/or proline and/or serine and/or threonine and/or
glutamine.

7. An enzyme according to claim 6, wherein one or more of said
amino acids appear in short, repetitive units.

8. An enzyme according to any of claims 1-7, which comprises a
carbohydrate binding domain derived from one naturally
occurring cellulose- or hemicellulose-degrading enzyme, an
amino acid sequence defining a linking B region, which amino
acid sequence is derived from another naturally occurring
cellulose- or hemicellulose-degrading enzyme, as well as a
catalytically active domain derived from the enzyme supplying
either the carbohydrate binding domain or B region or from a
third enzyme.

9. An enzyme according to claim 8, wherein the catalytically
active domain is derived from an enzyme which does not, in
nature, comprise a carbohydrate binding domain or B region.

10. An enzyme according to any of claims 1-9 which is a
cellulase, e.g. an endoglucanase, cellobiohydrolase or
β-glucosidase.

11. A DNA construct which comprises a DNA sequence encoding an
enzyme according to any of claims 1-10.

12. An expression vector which carries an inserted DNA
construct according to claim 11.

13. A cell which is transformed with a DNA construct according
to claim 11 or with an expression vector according to claim 12.

14. A cell according to claim 13 which is a fungal cell, e.g.
belonging to a strain of Aspergillus, e.g. Aspergillus niger or
Aspergillus oryzae, or a yeast cell, e.g. belonging to a strain
of Saccharomyces, such as Saccharomyces cerevisiae.

15. A method of producing an enzyme according to any of claims
1-10, wherein a cell according to claim 13 or 14 is cultured
under conditions conducive to the production of the enzyme, and
the enzyme is subsequently recovered from the culture.

16. An agent for degrading cellulose or hemicellulose, the
agent comprising an enzyme according to any of claims 1-10.

17. An agent according to claim 16 comprising a combination of
two or more enzymes according to any of claims 1-10, or a
combination of one or more enzymes according to any of claims
1-10 with one or more other enzymes with cellulose- or
hemicellulose-degrading activity.

18. A carbohydrate binding domain homologous to a terminal A
region of Trichoderma reesei cellulases, which carbohydrate
binding domain comprises the following amino acid sequence

1                                   10
Xaa Xaa Gln Cys Gly Gly Xaa Xaa Xaa Xaa Gly Xaa Xaa Xaa Cys Xaa

            20                                      30
Xaa Xaa Xaa Xaa Cys Xaa Xaa Xaa Asn Xaa Xaa Tyr Xaa Gln Cys Xaa

Xaa

or a subsequence thereof capable of effecting binding of a
protein to an insoluble cellulosic or hemicellulosic substrate.

19. A carbohydrate binding domain according to claim 18,
wherein the variations in the amino acid sequence shown in
claim 18 are selected as follows

in position 1, the amino acid is Trp or Tyr;
in position 2, the amino acid is Gly or Ala;
in position 7, the amino acid is Gln, Ile or Asn;
in position 8, the amino acid is Gly or Asn;
in position 9, the amino acid is Trp, Phe or Tyr;
in position 10, the amino acid is Ser, Asn, Thr or Gln;
in position 12, the amino acid is Pro, Ala or Cys;
in position 13, the amino acid is Thr, Arg or Lys;
in position 14, the amino acid is Thr, Cys or Asn;
in position 18, the amino acid is Gly or Pro;
in position 19, the amino acid (if present) is Ser, Thr, Phe,
Leu or Ala;
in position 20, the amino acid is Thr or Lys;
in position 24, the amino acid is Gln or Ile;
in position 26, the amino acid is Gln, Asp or Ala;
in position 27, the amino acid is Trp, Phe or Tyr;
in position 29, the amino acid is Ser, His or Tyr; and/or
in position 32, the amino acid is Leu, Ile, Gln, Val or Thr.

20. A carbohydrate binding domain according to claim 19,
wherein the carbohydrate binding domain comprises the following
amino acid sequence

Trp Gly Gln Cys Gly Gly Gln Gly Trp Asn Gly Pro Thr Cys Cys Glu
Ala Gly Thr Thr Cys Arg Gln Gln Asn Gln Trp Tyr Ser Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Ile Gly Trp Asn Gly Pro Thr Thr Cys Val
Ser Gly Ala Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Ile Gly Phe Asn Gly Pro Thr Cys Cys Gln
Ser Gly Ser Thr Cys Val Lys Gln Asn Asp Trp Tyr Ser Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Asn Gly Tyr Ser Gly Pro Thr Thr Cys Ala
Glu Gly - Thr Cys Lys Lys Gln Asn Asp Trp Tyr Ser Gln Cys Thr
Pro;

Trp Gly Gln Cys Gly Gly Gln Gly Trp Gln Gly Pro Thr Cys Cys Ser
Gln Gly - Thr Cys Arg Ala Gln Asn Gln Trp Tyr Ser Gln Cys Leu
Asn;

Trp Gly Gln Cys Gly Gly Gln Gly Tyr Ser Gly Cys Thr Asn Cys Glu
Ala Gly Ser Thr Cys Arg Gln Gln Asn Ala Tyr Tyr Ser Gln Cys
Ile;

Trp Gly Gln Cys Gly Gly Gln Gly Tyr Ser Gly Cys Arg Asn Cys Glu
Ser Gly Ser Thr Cys Arg Ala Gln Asn Asp Trp Tyr Ser Gln Cys
Leu;

Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys Val
Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
Leu;

Trp Gly Gln Cys Gly Gly Gln Asn Tyr Ser Gly Pro Thr Thr Cys Lys
Ser Pro Phe Thr Cys Lys Lys Ile Asn Asp Phe Tyr Ser Gln Cys
Gln; or

Trp Gly Gln Cys Gly Gly Asn Gly Trp Thr Gly Ala Thr Thr Cys Ala
Ser Gly Leu Lys Cys Glu Lys Ile Asn Asp Trp Tyr Tyr Gln Cys Val.

21. A linking B region derived from a cellulose- or
hemicellulose-degrading enzyme, said region comprising an amino
acid sequence enriched in the amino acids glycine and/or
asparagine and/or proline and/or serine and/or threonine and/or
glutamine.

22. A B region according to claim 21, wherein one or more of
said amino acids appear in short, repetitive units.

23. A B region according to claim 21 or 22, which comprises the
following amino acid sequence

Ala Arg Thr Asn Val Gly Gly Gly Ser Thr Gly Gly Gly Asn Asn Gly
Gly Gly Asn Asn Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly Asn Pro
Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly Asn Cys
Ser Pro Leu;

Pro Gly Gly Asn Asn Asn Asn Pro Pro Pro Ala Thr Thr Ser Gln Trp
Thr Pro Pro Pro Ala Gln Thr Ser Ser Asn Pro Pro Pro Thr Gly Gly
Gly Gly Gly Asn Thr Leu His Glu Lys;

Gly Gly Ser Asn Asn Gly Gly Gly Asn Asn Asn Gly Gly Gly Asn Asn
Asn Gly Gly Gly Gly Asn Asn Asn Gly Gly Gly Asn Asn Asn Gly Gly
Gly Asn Thr Gly Gly Gly Ser Ala Pro Leu;

Val Phe Thr Cys Ser Gly Asn Ser Gly Gly Gly Ser Asn Pro Ser Asn
Pro Asn Pro Pro Thr Pro Thr Thr Phe Ile Thr Gln Val Pro Asn Pro
Thr Pro Val Ser Pro Pro Thr Cys Thr Val Ala Lys;

Pro Ala Leu Trp Pro Asn Asn Asn Pro Gln Gln Gly Asn Pro Asn Gln
Gly Gly Asn Asn Gly Gly Gly Asn Gln Gly Gly Gly Asn Gly Gly Cys
Thr Val Pro Lys;

Pro Gly Ser Gln Val Thr Thr Ser Thr Thr Ser Ser Ser Ser Thr Thr
Ser Arg Ala Thr Ser Thr Thr Ser Ala Gly Gly Val Thr Ser Ile Thr
Thr Ser Pro Thr Arg Thr Val Thr Ile Pro Gly Gly Ala Ser Thr Thr
Ala Ser Tyr Asn;

Glu Ser Gly Gly Gly Asn Thr Asn Pro Thr Asn Pro Thr Asn Pro Thr
Asn Pro Thr Asn Pro Thr Asn Pro Trp Asn Pro Gly Asn Pro Thr Asn
Pro Gly Asn Pro Gly Gly Gly Asn Gly Gly Asn Gly Gly Asn Cys Ser
Pro Leu; or

Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser Pro Val Asn Gln
Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr Ser Ser Pro Pro
Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu Arg.
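
The B-region claims describe these linkers as "enriched" in glycine, asparagine, proline, serine, threonine and glutamine. The claims give no numerical threshold, so the following sketch only illustrates how that enrichment could be quantified for the first B-region sequence recited above; the function name and the idea of reporting a simple fraction are assumptions.

```python
# Residues named in the B-region claims as characteristic of the linker.
ENRICHED = {"Gly", "Asn", "Pro", "Ser", "Thr", "Gln"}


def enriched_fraction(three_letter_seq: str) -> float:
    """Fraction of residues that belong to the ENRICHED set."""
    residues = three_letter_seq.split()
    return sum(res in ENRICHED for res in residues) / len(residues)


# First B-region sequence recited in the claim, in three-letter code.
B_REGION = """
Ala Arg Thr Asn Val Gly Gly Gly Ser Thr Gly Gly Gly Asn Asn Gly
Gly Gly Asn Asn Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly Asn Pro
Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly Asn Pro Gly Gly Asn Cys
Ser Pro Leu
"""

print(len(B_REGION.split()), round(enriched_fraction(B_REGION), 3))
```

For this sequence roughly nine residues in ten fall in the named set, and the repeated Gly Gly Asn Pro motif is an example of the short, repetitive units the claims refer to.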

Fig. 5

agaccggaattcgcggccgccatctatccaacggtctagcttcacttcacaatgtatcgc

M Y R

atcgtcgcaaccgcctcggctcttattgccgctgctcgggctcaacaggtctgctctttg 

IVATASALIAAARAQQVCSL 

aacaccgagaccaagcctgccttgacctggtccaagtgtacatccagcggctgcagcgat 

NT ETKP ALTWSKCTSSGCSD 

gtcaagggctccgttgttattgatgccaactggcgatggactcaccagacttctgggtct 

VK G SVV I DANWRWTHQTSGS 

accaactgttacaccggaaacaagtgggacacctccatctgcactgatggcaagacctgc 

TN CYTGNKWDTSI CTDGKTC 

GCCGAAAAGTGCTGTCTTGATGGCGCCGACTATTCTGGTACCTACGGAATCACCTCCAGC 

AEKCCLDGADYSGTYGITSS 

ggcaaccagctcagtcttggattcgtcaccaacggtccctacagcaagaacatcggcagc 

G NQL SLGFVTNGPYSKNIGS 

cgaacctacctcatggagaacgagaacaccatccagatgttccagcttctgggcaacgag 

RT Y LME N E N T Y QM F Q L LG N E 
ttcacctttgatgtcgatgtctctggtatcggctgcggtctgaacggtgcccctcacttc 

F T F D V- D- V S G I G CG LN GA P H F 
gtcagcatggacgaggatggtggcaaggccaagtactccggaaacaaggccggagccaag 

V S MDEDGGKAKYSGNKAG AK 
tacggaactggcTACtGTGATGccCAgTGCCCTCGTGATGTCAAGTTCATCAACGGAGTT 

V G TGYCDAQC PRDVKFI NGV 
GCCAACTCTGAGGGCTGGAAGCCCTCTGACAGTGATGTCAACGCtggtgttggtaatctg 

A N S EGW K P S D S D V N AGVG N L 

ggcacctgctgccccgagatggatatctgggaggccaactccatctccaccgccttcact 

GTCCPEMDIWEANSI S TAF T 

cctcatccttgcaccaagctcacacagcactcttgcactggcgactpttgtggtggaacc 

PH PCTKL TQHS CTGD SCGGT 

tactctagtgaccgatatggcggtacttgcgatgccgacggttgtgatttcaatgcctac 

YS SDRYGGTCDADGCDFNAY 

cgtcagggcaacaagaccttctacggtcctggatccaacttcaacatcgacaccaccaag 

RQGNKTFYGPGSNFNIDTTK 

aagatgactgttgtcactcagttccacaagggcAGCAAcGGACGTCTTTCTGAGATCACC 

KMTVVTQFHKGSNGRLSEIT 

CGTCTGTACGTCCAGAACGGCAAGGTCATTGCCAACTCAGAGTCCAAGATTGCAGGCAAC 

R L Y V Q N G K V I A N S E S K I A G N 
CCCGGTAGCTCTCTCACCTCTGACTTCTGCTCCAAGCAGAAGAGCGTCTTTGGCGATATC 
PG SSLTS DFCSKQKSVFGDI 
GATGACTTCTCTAAGAAGGGTGGCTGGAACGgCATGAGCGATGCTCTCTCTGCCCCTATG 
DD F SKKGGWNGMSDALSAPM 
GTTCTTGTTATGTCTCTCTGGCACGACCACCACTCCAAcATGCtcTGGCTgGACtctacc 

V L VMS LW.HDHH SNMLWLDST 
tacccaaccgactctaccaaggttggatctcaacgaggttcttgcgctaccacctctggc 
YPTDSTKVGSQRGSCATTSG 

^ aagccctccgaccttgagcgagatgttcccaactccaaggtttccttctccaacatcaAG 
KP SDLERDVPNSKVSFSNIK 
TTCGGTCCCATCGGAAGCACCTACAAGAGCGACGGCACCACCCCCAACCCCCCTgCCAGC 
FG PIGSTYKSDGTTPNPPAS 

AGCAGCACCACTGGTTCTTCCACTCCCACCAACCCCCCTGCCGGTAGCGTCGACCAATGG 
SS TTGSSTP. TNPPAGSVDQW 
GGACAgTGcGGTGGCCAgaactacagcggccccacgacctgcaagtctcctttcacctgc 
GQCGGQNYSGPTTCKSPFTC 
aagaagatcaacgacttctactcccagtgtcagtaaaggggctgccgagctatctagcat 
KK INDFYSQCQ . 

gagattgagaaacgatgtgatgagtggacgatcaaggagaagtgtgtggatgatatgaac 
ttgatgtgggaggac c-. n m 1 

gaattcgcggccgcctgcttcgaagcatcagctcattgagatcagtcaaaatgcatacc

M H T 

ctttcggttctcctcgctctcgctcccgtgtccgcccttgctcaggctcccatctgggga 

LSVLLALAPVSALAQAPIWG 

cagtgcggtggcaatggttggaccggtgctacaacctgcgctagtggtctgaagtgtgag 

QCGGNGWTGATT CASGLKCE 

aagatcaacgactggtactatcagtgtgttcctggatctggaggatctgaaccccagcct 

KINDWYYQ CVPGSGGSEPQP 

tcgtcaactcagggtggtggcactcctcagcctactggcggtaacagcggcggcactggt 

SSTQGGGT-PQPTGGNSGGTG 

ctcgacgccaaattcaaggccaagggcaagcagtactttggtaccgagattgaccactac 

LDAKFKAKGKQYFGTEI DHY 

caccttaacaacaatcctctgatcaacattgtcaaggcccagtttggccaagtgacatgc 

H LN NN P L I N I VKAQ F- G QVTC 

gagaacagcatgaagtgggatgccattgagccttcacgcaactccttcaccttcagtaac 

E N S M K W D A I E P S R N S F T F S N 

gctgacaaggtcgtcgacttcgccactcagaacggcaagctcatccgtgGCCACACTCTT 

ADKVVDFATQNGKLIRGHTL 

CTCTGGCACTCTCAGCTGCCTCAGTGGGTTCAGAACATCAACGATCGCTCTACCCTCACC 

LWHSQLPQWVQNINDRSTLT 

GCGGTCATCGAGAACCACGTCAAGACCATGGTCACCCGCTACAAGGGCAAGATCCTCCAG 

AVIENHVKTMVTR YKGKILQ 

TGGGACGTTGTCAACAACGAGATCTTCGCTGAGGACGGTAACCTCCGCGACAGTGTCTTC 

WDVVNNE I FAEDGNLRDS'VF 

AGCCGAGTTCTCGGTGAGGACTTTGTCGGTATTGCTTTCCGCGCTGCCCGCGCCGCTGAT 

SRVLGEDFVGIAFRAARAAD 

CCCGCTGCCAAGCTCTACATCAACGATTATAACCTCGACAAGTCCGACTATGCTAAGGTC 

PAAKLYINDYNLDKSDYAKV 

ACCCGCGGAATGGTCGCTCACGTTAATAAGTGGATTGCTGCTGGTATTCCCATCGACGGT 

TRGMVAHVNKWIA AGIPID G 

ATTGGATCTCAGGGCCATCTTGCTGCTCCTAGTGGCTGGAACCCTGCCTCTGGTGTTCCT 

I GSQGHL AAPSGWNPASGVP 

GCTGCTCTCCGAGCTCTTGCCGCCTCGGACGCCAAGGAGATTGCTATcactgagcttgat 

AA LR ALAASDAKEIAI TELD 

attgccggtgccagtgctaacgattaccttactg-tcatgaacgcttgccttgccgttccc 

I A GA S . A N D YL TVMNAC LAVP 

aagtgtgtcggcatcactgtctggggtgtctctgacaaggactcgtggcgacctggtgac 

KCVGITVWGVSDKDSWRPGD 

aaccccctcctctacgacagcaactaccagcccaaggctgctttcaatgccttggctaac 

NPLLYDSNYQPKA AF NALAN 

gctctgtgagctgttgttgatgtatgtcgctggatcatacaacgaaacgtcctagttgga 

A L . 

taaagcgttgatggtagaatgat 



Fig. 12 
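
The amino acid lines in these sequence figures can be cross-checked against the DNA lines by translating each 60-base row with the standard genetic code. The sketch below is illustrative; the function and constant names are assumptions, and the reading frame is assumed to start at the first base of the row, as the figure layout suggests.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; '*' marks a stop codon.

    Case and spaces are ignored; a trailing partial codon is dropped.
    """
    seq = dna.upper().replace(" ", "")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Second DNA row of the figure above (signal peptide region).
ROW = "ctttcggttctcctcgctctcgctcccgtgtccgcccttgctcaggctcccatctgggga"
print(translate(ROW))
```

Translating the row in this way reproduces the protein line printed beneath it, which gives a quick means of spotting transcription errors in either line.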

gaattcgcggccgcctagataagtcactacctgatctctgaataatctttcatcatgaag

M K 

tctctctcactcatcctctcagccctggctgtccaggtcgctgttgctcaaacccccgac 

SLSLI L S A L A V Q VAVAQTPD 

aaggccaaggagcagcaccccaagctcgagacctaccgctgcaccaaggcctctggctgc 

KAKEQHPKLETYRCTKASGC 

aagaagcaaaccaactacatcgtcgccgaCgcaggtattcacggcattCgcagaagcgCC 

KKQTNYIVADAG I HGIRRS A 

GGCTGCGGTGACTGGGGTCAAAAGCCCAACGCCACAGCCTGCCCCGATGAGGCATCCTGC 

GCGDWGQKPNATACPDEASC 

GCTAAGAACTGTATCCTCAGTGGTATGGACTCAAACGCTTACAAGAACGCTGGTATCACT 

AKNCILSGMDS- NAYXNAGIT 

ACTTCTGGCAACAAGCTTCGTCTTCAGCAGCTTATCAACAACCAGCTTGTTTCTCCTCGG 

TS GNKLRLQQL I NNQLVSPR 

GTTTATCTGCTTGAGGAGAACAAGAAGAAGTATGAGATGCTTCAGCTCACTGGTACTGAA 

VYLLEENKKKYEML 'H L T G T E 

TTCTCTTTCGACGTTGAGATGGAGAAGCTTCCTTGTGGTATGAATGGTGCTTTGTACCTT 

FS FDVEM EK LPC GMNGALYL 

TCCGAGATGCCACAGGATGGTGGTAAGAGCACGAGCCGAAACAGCAAGGCTGGTGCCTAC 

S EMPQDGGKSTS RNS KAGAY 

TATGGTGCTGGATACTGTGATGCTCAGTGCTACGTCactcctttcATCAACGGAGTTGGC 

YGAGYCDAQCYVTPFINGVG 

AACATCAAGGGACAGGGTGTCTGCTGTAACGAGCTCGACATCTGGGAGGCCAACTCCCGC 

NI KG QGVCCNEL D IWEANSR 

GCAACTCACATTGCTCCTCACCCTTGCAGCAAGCCCGGCCTCTACGGCTGCACAGGCGAT 

ATH IAP HPCSKP GLYGCT GD 

GAGTGCGGCAGCTCCGGTTTCTGCGACAAGGCCGGCTGCGGCTGGAACCACAACCGCATC 

ECGSSGICDKAG CGWNHNRI 

AACGTGACCGACTTCTACGGccgcggCAAGCAGTACAAGGTCGACAGCACCCGCAAGTTC 

N VTDFYGRGKQY KVDST RKF 

ACCGTGACATCTCAGTTCGTCGCCAACAAGCAGGGTGATCTCATCGAGCTGCACCGCCAC 

T VTSQFVANKQG DLIELHRH 

TACATCCAGGACAACAAGGTCAtcgagtctgctgtcgtcaacatctccggccctcccaag 

YIQDNKV IESA VVNISGPPK 

atcaacttcatcaatgacaagtactgcgctgccaccggcgccaacgagtacatgcgcctc 

INFINDKYCAATGANEYMRL 

ggcggtactaagcaaatgggcgatgccatgtcccgcggaatggttctcgccatgagcgtc 

G.G TKQMGDAM SR G MVLAMSV 

tggtggagcgagggtgatttcatggcctggttggatcagggtgttgctggaccctgtgac 

WW S E G D FMAWL D Q G VAGPCD 

gccaccgagggcgatcccaagaacatcgtcaaggtgcagcccaaccctgaagtgacattt 

AT EGDPKNIV KV Q PN PEVTF 

agcaacatcagaattggagagattggatctacttcatcggtcaaggctcctgcgtatcct 

SNIRIGEIGS TSSVKAPAYP 

ggtcctcaccgcttgtaaaaacatcaaacaacaccgtgtccaatatggATCTTAGTGTCC 

G P H R L . 

ACTTGCTGGGAAGCTATTGGAGCACATATGCAAAACAGATGTCCACTAGCTTGACACGTA 
TGTCGGGGCAAAAAAATCTCTTTCTAGGATAGGAGAACATATTGGGTGTTTGGACTTGTA 
TATAAATGATACATTTTTCATATTATATTATTTTCAACATATTTTATTTCACGAAAAAAA 
AAAAAAAAAAAAAAAAAAAAAAAA 

Fig. 13 






        10        20        30        40        50        60
         |         |         |         |         |         |
TTTCTTCGTCGAGCTCGAGTCGTCCGCCGTCTCCTCCTCCTCCTCCTTCCAGTCTTTGAG